AWS S3 Glacier
S3 Glacier is a storage solution that’s optimized for archived, infrequently used data or “cold” data.
S3 Glacier is a highly secure, durable, low-cost storage solution for data archiving or long-term backup.
S3 Glacier is designed to provide an average annual durability of 99.999999999% for an archive.
S3 Glacier stores redundant data in multiple facilities on multiple devices.
Glacier synchronously stores data across multiple facilities before returning SUCCESS on an archive upload, which increases durability.
Glacier is designed to automatically heal itself and performs regular, systematic data integrity tests.
Glacier lets customers offload the administrative burden of operating and scaling storage to AWS, so they do not have to worry about data replication, capacity planning, hardware failure detection and recovery, or time-consuming hardware migrations.
Glacier is a good choice when storage cost matters, the data is rarely retrieved, and a retrieval latency ranging from minutes to hours is acceptable.
Glacier now offers data retrieval options that take anywhere from hours to just a few minutes.
If applications require frequent, rapid, real-time access to data, S3 should be used instead.
S3 Glacier can store almost any type of data in any format.
Glacier supports interaction via the AWS Management Console, the AWS CLI and SDKs, or REST-based APIs.
The Management Console can be used only to create and delete vaults. The rest of the operations, such as uploading data, creating retrieval jobs, and downloading job output, require the CLI, an SDK, or the REST-based APIs.
Use cases include
Digital media archives
Data that must be kept for regulatory compliance
Financial and healthcare records
Raw genomic sequence data
Long-term database backups
S3 Glacier Data Model
The core concepts of the Glacier data model are vaults and archives, along with job and notification configuration resources.
Vault
A vault is a container for storing archives.
Each vault resource has a unique address, which is made up of the region where the vault was created and the vault name, which is unique within that region and account. For example: https://glacier.us-west-2.amazonaws.com/111122223333/vaults/examplevault
A vault can store an unlimited number of archives.
Vault operations are region specific.
An AWS account can create up to 1,000 vaults per region.
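The vault address format described above can be sketched as a pair of small helpers. This is only an illustration: `vault_url` and `parse_vault_url` are hypothetical names, not part of any AWS SDK, and the values used are the example ones from the address shown earlier.

```python
from urllib.parse import urlparse


def vault_url(region: str, account_id: str, vault_name: str) -> str:
    """Build a vault address from its region, owning account ID, and name."""
    return f"https://glacier.{region}.amazonaws.com/{account_id}/vaults/{vault_name}"


def parse_vault_url(url: str) -> dict:
    """Split a vault address back into region, account ID, and vault name."""
    parsed = urlparse(url)
    # The host has the form glacier.<region>.amazonaws.com
    region = parsed.netloc.split(".")[1]
    # The path has the form /<account-id>/vaults/<vault-name>
    account_id, _, vault_name = parsed.path.strip("/").split("/")
    return {"region": region, "account_id": account_id, "vault_name": vault_name}


print(vault_url("us-west-2", "111122223333", "examplevault"))
```

Because the region is part of the address, the same vault name can exist independently in two regions under one account.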
Archive
An archive is any data, such as a photograph, video, or document. It is the base unit of storage in Glacier.
Each archive has a unique ID and an optional description. The description can only be specified during upload.
Glacier assigns an ID to the archive, which is unique in each AWS region where it is stored.
An archive can be uploaded in a single request. For large archives, Glacier offers a multipart upload API that uploads an archive in parts across multiple requests.
An archive can be up to 40 TB in size.
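To make the multipart upload idea concrete, here is a minimal sketch of how a client might pick a part size and split an archive into parts. It assumes the limits documented for Glacier multipart uploads at the time of writing: part sizes are a power of two between 1 MiB and 4 GiB, and an upload has at most 10,000 parts (10,000 parts of 4 GiB each is where the roughly 40 TB ceiling comes from). The function names are hypothetical, and no AWS calls are made.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 1 * MIB       # assumed smallest allowed part
MAX_PART_SIZE = 4096 * MIB    # assumed largest allowed part (4 GiB)
MAX_PARTS = 10_000            # assumed maximum parts per upload


def choose_part_size(archive_size: int) -> int:
    """Return the smallest power-of-two part size that fits within MAX_PARTS."""
    if archive_size <= 0:
        raise ValueError("archive_size must be positive")
    part_size = MIN_PART_SIZE
    while part_size * MAX_PARTS < archive_size:
        part_size *= 2
        if part_size > MAX_PART_SIZE:
            raise ValueError("archive exceeds the maximum multipart upload size")
    return part_size


def part_ranges(archive_size: int, part_size: int):
    """Yield inclusive (start, end) byte offsets for each part, in order."""
    for start in range(0, archive_size, part_size):
        yield start, min(start + part_size, archive_size) - 1


size = 25 * 1024 * MIB  # a 25 GiB archive
part = choose_part_size(size)
print(part // MIB, "MiB parts,", sum(1 for _ in part_ranges(size, part)), "parts")
```

With boto3, each range would typically be sent with `upload_multipart_part` between calls to `initiate_multipart_upload` and `complete_multipart_upload`; the exact parameters are best checked against the SDK documentation.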
Jobs
A job is required to retrieve an archive or a vault inventory list.
Data retrieval requests are asynchronous operations that are queued and take approximately four hours to complete.
A job is initiated first, and its output is downloaded after the job completes.
Vault inventory jobs require the vault name.
Data retrieval jobs require the vault name and the archive ID, with an optional description.
A vault can have multiple jobs in progress at any given time. Each job is identified by a Job ID, which is assigned when the job is created and is used for tracking.
Glacier maintains job information, such as job type, description, creation date, completion date, and job status, which can be queried.
The job output can be downloaded either in its entirety or in part after it is completed.
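The job workflow above (initiate a job, wait for completion, then download all or part of the output) can be sketched as follows. The helper names are hypothetical. The boto3 calls (`initiate_job`, `describe_job`, `get_job_output`) are written from memory of the Glacier client and are never executed here, and the 15-minute polling interval is an arbitrary assumption. `retrieve_archive` needs boto3 and AWS credentials, so only the pure helpers run at the bottom.

```python
import time

VALID_TIERS = ("Expedited", "Standard", "Bulk")


def inventory_job_params(description=None):
    """Parameters for a vault inventory job (the vault name is passed separately)."""
    params = {"Type": "inventory-retrieval"}
    if description:
        params["Description"] = description
    return params


def archive_job_params(archive_id, description=None, tier="Standard"):
    """Parameters for a data retrieval job: archive ID plus optional description."""
    if tier not in VALID_TIERS:
        raise ValueError(f"tier must be one of {VALID_TIERS}")
    params = {"Type": "archive-retrieval", "ArchiveId": archive_id, "Tier": tier}
    if description:
        params["Description"] = description
    return params


def byte_ranges(total_size, chunk_size):
    """Range strings for downloading job output in chunks of chunk_size bytes."""
    return [
        f"bytes={start}-{min(start + chunk_size, total_size) - 1}"
        for start in range(0, total_size, chunk_size)
    ]


def retrieve_archive(vault_name, archive_id, poll_seconds=900):
    """Initiate a retrieval job, poll until it finishes, and return the bytes."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("glacier")
    job_id = client.initiate_job(
        vaultName=vault_name, jobParameters=archive_job_params(archive_id)
    )["jobId"]
    while True:
        status = client.describe_job(vaultName=vault_name, jobId=job_id)
        if status["Completed"]:
            break
        time.sleep(poll_seconds)
    if status["StatusCode"] != "Succeeded":
        raise RuntimeError(f"job {job_id} ended with status {status['StatusCode']}")
    return client.get_job_output(vaultName=vault_name, jobId=job_id)["body"].read()


print(archive_job_params("EXAMPLE-ARCHIVE-ID", description="quarterly restore"))
print(byte_ranges(10, 4))
```

Polling is shown only for simplicity; because jobs can take hours, a notification on completion is usually a better fit than a long-running loop.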
Notification Configuration
Because jobs are asynchronous, Glacier supports a notification mechanism that publishes to an SNS topic when a job completes.
The SNS topic for notifications can be specified either with each individual job request or on the vault.
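The two ways of attaching an SNS topic, per job or per vault, might look like the sketch below. The helper names and the topic ARN are hypothetical. The dictionary keys (`SNSTopic`, `Events`) and the `set_vault_notifications` call reflect the boto3 Glacier client as recalled and should be checked against the SDK documentation. `apply_to_vault` is defined but not called, since it needs boto3 and AWS credentials.

```python
JOB_EVENTS = ("ArchiveRetrievalCompleted", "InventoryRetrievalCompleted")


def vault_notification_config(sns_topic_arn, events=JOB_EVENTS):
    """Vault-level configuration: notify the topic for the listed job events."""
    unknown = set(events) - set(JOB_EVENTS)
    if unknown:
        raise ValueError(f"unsupported events: {sorted(unknown)}")
    return {"SNSTopic": sns_topic_arn, "Events": sorted(events)}


def with_job_notification(job_params, sns_topic_arn):
    """Job-level alternative: attach the topic to a single job request."""
    return {**job_params, "SNSTopic": sns_topic_arn}


def apply_to_vault(vault_name, config):
    """Store the notification configuration on the vault (requires boto3)."""
    import boto3  # imported lazily so the helpers above work without it

    boto3.client("glacier").set_vault_notifications(
        vaultName=vault_name, vaultNotificationConfig=config
    )


# Hypothetical topic ARN, for illustration only.
topic = "arn:aws:sns:us-west-2:111122223333:example-topic"
print(vault_notification_config(topic))
print(with_job_notification({"Type": "inventory-retrieval"}, topic))
```

A vault-level configuration covers every job started against that vault, while a topic on the job request applies to that job alone.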
Glacier stores