Athena에서 s3 glacier에 저장된 객체에 대한 쿼리 가능여부

2020-12-02

.

Data_Engineering_TIL(20201202)

AWS support cases로 공부한 내용입니다.

[학습내용]

결론 : Athena can query S3 objects that are restored from Glacier.

–> 가능은한데 중간에 glacier로부터 복원하는 과정이 필요하다.

상세내용 :

According to official documentation [1], Athena does not support querying the data in the GLACIER storage class. Also as you might know, when you restore a glacier object, Amazon S3 restores a temporary copy of the object only for the specified duration.

After that, it deletes the restored object copy [2]. This means even if the object has been restored from Glacier and can be accessible in real time, the object is still in Glacier storage class.

If you want to make Athena query work with the GLACIER objects, please transition the object back to the Amazon S3 Standard storage class. You will need to first restore each object individually, and then change the storage class to ‘Standard’ by running a copy operation either by overwriting the existing object or copying the object into another location [3]. Please see the section “Change the object’s storage class to Amazon S3 Standard” in documentation link [3].

To check if the object was restored successfully or not and its storage class, please use the command below:

$ aws s3api head-object –bucket abcde –key objectname

Sample output:

{
    "AcceptRanges": "bytes",
    "Restore": "ongoing-request=\"false\", expiry-date=\"Mon, 19 Nov 2020 00:00:00 GMT\"",
    "LastModified": "2020-04-01T00:01:03+00:00",
    "ContentLength": 258621,
    "ETag": "\"b4846a06d1dd903ccdf5b4a00b56fb94-1\"",
    "ContentType": "",
    "ServerSideEncryption": "AES256",
    "Metadata": {},
    "StorageClass": "GLACIER"
}

References:

[1] Creating Tables in Athena - Requirements for Tables in Athena and Data in Amazon S3 - https://docs.aws.amazon.com/athena/latest/ug/creating-tables.html#s3-considerations

[2] Restoring Archived Objects - https://docs.aws.amazon.com/AmazonS3/latest/dev/restoring-objects.html

[3] How can I restore an S3 object from the Amazon Glacier storage class using the AWS CLI? - https://aws.amazon.com/premiumsupport/knowledge-center/restore-s3-object-glacier-storage-class/