Details
- Sub-task
- Status: Resolved
- Major
- Resolution: Fixed
- 2.7.0
- Hadoop on AWS
Description
By default, Amazon S3 stores objects in the STANDARD storage class.
According to Amazon's material, this offers 99.999999999% durability.
For many applications, however, the 99.99% durability offered by the REDUCED_REDUNDANCY storage class is sufficient, and it comes with a significant cost saving.
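To make the 99.99% figure concrete, it can be read as an expected annual loss rate of 0.01% of stored objects. The sketch below does only that arithmetic; the class and method names are hypothetical and are not part of Hadoop or any AWS library.

```java
/** Hypothetical helper: turns an annual durability figure into an expected loss count. */
final class DurabilityEstimate {

    /**
     * Expected number of objects lost per year.
     *
     * @param objectCount      number of stored objects (must be >= 0)
     * @param annualDurability durability as a fraction in [0, 1], e.g. 0.9999 for 99.99%
     */
    static double expectedAnnualLoss(long objectCount, double annualDurability) {
        if (objectCount < 0 || annualDurability < 0.0 || annualDurability > 1.0) {
            throw new IllegalArgumentException("invalid object count or durability");
        }
        return objectCount * (1.0 - annualDurability);
    }

    public static void main(String[] args) {
        // 10,000 objects at 99.99% durability: about one object lost per year on average.
        System.out.println(expectedAnnualLoss(10_000L, 0.9999));
    }
}
```

Whether that loss rate is acceptable depends on whether the data can be regenerated, which is typically the case for intermediate job output.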
Hadoop, when using the legacy s3n connector or the newer s3a connector, should support overriding the default storage class of created S3 objects so that users can take advantage of this cost benefit.
This would require minor changes to the s3n and s3a drivers, using a configuration property fs.s3n.storage.class to override the default storage class when desired.
This override could be implemented in Jets3tNativeFileSystemStore with:
    S3Object object = new S3Object(key);
    ...
    if (storageClass != null) {
        object.setStorageClass(storageClass);
    }
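How storageClass might be obtained from the configuration is not shown above. The following is a minimal sketch, assuming java.util.Properties as a stand-in for the Hadoop Configuration object and assuming that only the two storage classes discussed in this issue are accepted; the class StorageClassConfig and its resolve method are hypothetical.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.Properties;

/** Hypothetical helper: resolves the optional storage class override from configuration. */
final class StorageClassConfig {

    /** Property name proposed in this issue. */
    static final String KEY = "fs.s3n.storage.class";

    /**
     * Returns the configured storage class, or empty if the property is unset or blank,
     * in which case the caller should leave the S3 default untouched.
     */
    static Optional<String> resolve(Properties conf) {
        String raw = conf.getProperty(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        String value = raw.trim().toUpperCase(Locale.ROOT);
        // Assumption: only the two classes named in this issue are allowed here.
        if (!value.equals("STANDARD") && !value.equals("REDUCED_REDUNDANCY")) {
            throw new IllegalArgumentException("Unsupported storage class: " + raw);
        }
        return Optional.of(value);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEY, "reduced_redundancy");
        System.out.println(resolve(conf).orElse("(S3 default)"));
    }
}
```

Returning an empty Optional for an unset property keeps the existing behaviour as the default, so the override only takes effect when a user opts in.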
It would take a more complex form in s3a, e.g. setting:
    InitiateMultipartUploadRequest initiateMPURequest =
        new InitiateMultipartUploadRequest(bucket, key, om);
    if (storageClass != null) {
        initiateMPURequest.setStorageClass(storageClass);
    }
and similar statements in various places.
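Because the same check would have to be repeated for every request type, one option is to route all request creation through a single helper so that no upload path is missed. The sketch below illustrates that idea only; StorageClassAware, FakeRequest, and RequestPreparer are hypothetical stand-ins and do not correspond to AWS SDK or Hadoop classes.

```java
/** Hypothetical stand-in for any request type that accepts a storage class. */
interface StorageClassAware {
    void setStorageClass(String storageClass);
}

/** Hypothetical request used only to demonstrate the helper. */
final class FakeRequest implements StorageClassAware {
    final String kind;
    String storageClass; // null means "use the S3 default"

    FakeRequest(String kind) {
        this.kind = kind;
    }

    @Override
    public void setStorageClass(String storageClass) {
        this.storageClass = storageClass;
    }
}

/** Applies the configured storage class, if any, to every request passed through it. */
final class RequestPreparer {
    private final String storageClass; // null when no override is configured

    RequestPreparer(String storageClass) {
        this.storageClass = storageClass;
    }

    <T extends StorageClassAware> T prepare(T request) {
        if (storageClass != null) {
            request.setStorageClass(storageClass);
        }
        return request;
    }

    public static void main(String[] args) {
        RequestPreparer preparer = new RequestPreparer("REDUCED_REDUNDANCY");
        FakeRequest put = preparer.prepare(new FakeRequest("put"));
        FakeRequest multipart = preparer.prepare(new FakeRequest("multipart"));
        System.out.println(put.kind + ": " + put.storageClass);
        System.out.println(multipart.kind + ": " + multipart.storageClass);
    }
}
```

With a single entry point, adding a new upload path cannot silently skip the override, which is the kind of gap later reported in HADOOP-18339.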
Issue Links
- breaks
  - HADOOP-18292 s3a storage class reduced redundancy breaks s3 select tests (Open)
- causes
  - HADOOP-18371 s3a FS init logs at warn if fs.s3a.create.storage.class is unset (Resolved)
  - HADOOP-18339 S3A storage class option only picked up when buffering writes to disk (Resolved)
- depends upon
  - HADOOP-13050 Upgrade to AWS SDK 1.11.45 (Resolved)
- is duplicated by
  - HADOOP-14326 S3A to support S3 reduced redundancy storage (Resolved)
  - HADOOP-16259 Distcp to set S3 Storage Class (Resolved)
- is related to
  - HADOOP-16259 Distcp to set S3 Storage Class (Resolved)
- relates to
  - HADOOP-14837 Handle S3A "glacier" data (Open)
  - HADOOP-18339 S3A storage class option only picked up when buffering writes to disk (Resolved)
  - HADOOP-18281 Tune S3A storage class support (Open)
- links to