Monday, July 13, 2020

S3 - Tiers, Glacier, Lifecycle policies

Availability vs. Durability

-         Availability and durability are two very different aspects of data accessibility. 
-         Availability refers to system uptime, i.e. the storage system is operational and can deliver data upon request. Historically, this has been achieved through hardware redundancy, so that if any component fails the data remains accessible.
-         Durability, on the other hand, refers to long-term data protection, i.e. the stored data does not suffer from bit rot, degradation or other corruption. Rather than focusing on hardware redundancy, it is concerned with data redundancy, so that data is never lost or compromised.
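
To make the difference concrete, here is a small arithmetic sketch. The 99.99% availability figure is an assumption taken from AWS's published design target for S3 Standard (it is not in these notes), and the function names are made up for illustration.

```python
from fractions import Fraction

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_minutes_per_year(availability_percent: str) -> float:
    """Availability: how long the service may be unreachable in a year."""
    unavailable = 1 - Fraction(availability_percent) / 100
    return float(unavailable * SECONDS_PER_YEAR / 60)


def expected_objects_lost_per_year(durability_percent: str, object_count: int) -> float:
    """Durability: how many stored objects you expect to lose in a year."""
    loss_probability = 1 - Fraction(durability_percent) / 100
    return float(loss_probability * object_count)


if __name__ == "__main__":
    # Assumed design target for S3 Standard availability (not from the notes).
    print(downtime_minutes_per_year("99.99"))
    # 11 9's durability applied to 10 million stored objects.
    print(expected_objects_lost_per_year("99.999999999", 10_000_000))
```

With 10 million objects at 11 9's durability, the expected loss works out to 0.0001 objects per year, i.e. roughly one object every 10,000 years, while 99.99% availability still allows about 52.56 minutes of downtime per year. The two numbers answer different questions.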

S3 TIERS
-         https://aws.amazon.com/s3/storage-classes/
-         Storage class - can be selected when uploading
-         Can change class when setting lifecycle configuration or replication configuration
-         Frequently accessed:
          §  STANDARD – 99.999999999% (11 9's) durability, stored across >= 3 AZs
          §  REDUCED_REDUNDANCY (RRS) – lower durability (99.99%), meant for noncritical data that can be reproduced
-         Infrequently accessed:
          §  STANDARD_IA – good for DR, Infrequent Access
          §  ONEZONE_IA – 1 AZ only; you can store data here and replicate it into another AZ yourself
-         Optimized:
          §  INTELLIGENT_TIERING – if an object is not accessed for 30 days, it is moved to the infrequent access tier. A lifecycle rule can't move data from INTELLIGENT_TIERING to STANDARD
-         Archive:
          §  GLACIER – min storage 90 days, fastest (Expedited) retrieval takes 1-5 min
          §  GLACIER DEEP_ARCHIVE – min storage 180 days
-         A lifecycle rule can’t move objects from S3-IA back to S3 Standard or S3-RRS
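
The "can't go back up" rules above can be sketched as a one-way ordering check. The ordering below is an assumption based on the lifecycle transition waterfall in the AWS documentation, and `can_transition` is a hypothetical helper, not an AWS API.

```python
# Assumed waterfall order for lifecycle transitions: a rule may only move
# objects to a class that sits further down the list.
WATERFALL = [
    "STANDARD",
    "STANDARD_IA",
    "INTELLIGENT_TIERING",
    "ONEZONE_IA",
    "GLACIER",
    "DEEP_ARCHIVE",
]


def can_transition(source: str, target: str) -> bool:
    """Return True if a lifecycle rule may move objects from source to target."""
    # Per the notes, nothing can be transitioned into Reduced Redundancy.
    if target == "REDUCED_REDUNDANCY":
        return False
    if source not in WATERFALL or target not in WATERFALL:
        raise ValueError(f"storage class not modeled in this sketch: {source} -> {target}")
    return WATERFALL.index(source) < WATERFALL.index(target)


if __name__ == "__main__":
    for pair in [("STANDARD", "STANDARD_IA"),
                 ("STANDARD_IA", "STANDARD"),
                 ("INTELLIGENT_TIERING", "STANDARD"),
                 ("GLACIER", "DEEP_ARCHIVE")]:
        print(pair, can_transition(*pair))
```

This only models lifecycle transitions. Copying an object to a different storage class by hand is a separate operation and is not restricted by this ordering.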

[Figure: Amazon S3 storage class waterfall graphic]

Glacier
-         https://docs.aws.amazon.com/amazonglacier/latest/dev/introduction.html
-         Amazon S3 Glacier is a secure, durable, and extremely low-cost Amazon S3 storage class for data archiving and long-term backup
-         Synchronous upload, asynchronous retrieval
-         Byte ranges – a range retrieval must be megabyte aligned (multiples of 1 MB). Specify the FROM byte and TO byte in the HTTP GET message; useful when data is zipped and you need just part of it.
-         Glacier Select – can pass in SELECT statements with API calls to filter the search
-         Retrieval – asynchronous: first initiate a job, then download the job output once the job has completed
-         Upload – synchronous: since Glacier writes each upload to multiple facilities for redundancy, the ACK is sent only once all locations are updated

-         Amazon S3 Glacier offers a retrieval free tier of 10 GB per month. The allowance can be used at any time during the month and applies to Standard retrievals.
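
The byte-range and "initiate a job first" points above can be sketched as follows. The helper names are hypothetical; the dictionary keys follow the Glacier job parameters as I understand them (`Type`, `ArchiveId`, `Tier`, `RetrievalByteRange`), so treat this as a sketch rather than a reference.

```python
MB = 1024 * 1024


def aligned_range(first_byte: int, last_byte: int, archive_size: int) -> str:
    """Widen [first_byte, last_byte] to megabyte boundaries.

    The start must be a multiple of 1 MB; the end must fall just before a
    1 MB boundary or be the last byte of the archive.
    """
    if not 0 <= first_byte <= last_byte < archive_size:
        raise ValueError("byte range must lie inside the archive")
    start = (first_byte // MB) * MB
    end = min((last_byte // MB + 1) * MB, archive_size) - 1
    return f"{start}-{end}"


def archive_retrieval_job(archive_id: str, byte_range: str, tier: str = "Standard") -> dict:
    """Build the parameters for an asynchronous archive-retrieval job."""
    return {
        "Type": "archive-retrieval",
        "ArchiveId": archive_id,
        "Tier": tier,
        "RetrievalByteRange": byte_range,
    }


if __name__ == "__main__":
    # "example-archive-id" is a placeholder, not a real archive.
    rng = aligned_range(1_500_000, 2_500_000, archive_size=10 * MB)
    print(archive_retrieval_job("example-archive-id", rng))
```

With boto3, a dictionary like this would be passed as `jobParameters` to the Glacier client's `initiate_job`; you would then poll `describe_job` until the job completes and fetch the data with `get_job_output`. No AWS call is made here, so the sketch runs without credentials.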


Lifecycle policies
-         a sub-resource
-         applied to a bucket, a bucket folder, or objects with a specific tag or prefix
-         Actions:
-         Transition – move to another tier
          §  From S3 Standard to S3-IA – objects must stay in Standard for at least 30 days first, and must then remain in IA for at least another 30 days
          §  Min object size is 128KB
          §  Cannot have a policy to move from Glacier to S3-Std/S3-IA. Workaround – extract to RRS (Reduced Redundancy), then copy out to S3. Can’t go into RRS from S3-Std or S3-IA
-         Expiration – delete
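
A lifecycle policy with both actions might look like the sketch below. The rule ID, prefix and day counts are made-up examples, and `lifecycle_rule` is a hypothetical helper that just builds the configuration dictionary and enforces the 30-day minimums from the notes above.

```python
import json


def lifecycle_rule(rule_id: str, prefix: str, ia_after_days: int,
                   glacier_after_days: int, expire_after_days: int) -> dict:
    """Build one rule: Standard -> Standard-IA -> Glacier -> delete."""
    if ia_after_days < 30:
        raise ValueError("objects must stay in Standard for at least 30 days")
    if glacier_after_days < ia_after_days + 30:
        raise ValueError("objects must stay in Standard-IA for at least 30 days")
    # Sanity check of this sketch: expire only after the last transition.
    if expire_after_days <= glacier_after_days:
        raise ValueError("expiration must come after the last transition")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


if __name__ == "__main__":
    config = {"Rules": [lifecycle_rule("archive-logs", "logs/", 30, 90, 365)]}
    print(json.dumps(config, indent=2))
```

With boto3, a dictionary of this shape would be passed as `LifecycleConfiguration` to the S3 client's `put_bucket_lifecycle_configuration`, together with the bucket name. The sketch only prints the configuration and does not call AWS.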
