  1. Hadoop Common
  2. HADOOP-18067 Über-jira: S3A Hadoop 3.3.5 features
  3. HADOOP-12020

Support configuration of different S3 storage classes


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 3.3.5
    • Component/s: fs/s3
    • Hadoop on AWS

    Description

      Amazon S3 uses the STANDARD storage class for S3 objects by default.
      According to Amazon's material, this class is designed for 99.999999999% durability.
      For many applications, however, the 99.99% durability offered by the REDUCED_REDUNDANCY storage class is amply sufficient and comes with a significant cost saving.

      Hadoop, when using the legacy s3n connector or the newer s3a connector, should support overriding the default storage class of created S3 objects so that users can take advantage of this cost benefit.

      This would require minor changes to the s3n and s3a drivers, using
      a configuration property fs.s3n.storage.class to override the default storage class when desired.
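
      As a minimal sketch of how such a property might be read, the following self-contained Java maps an optional property value to a storage class and returns nothing when the property is unset, so the S3 default stays in place. The `StorageClassOption` enum and the `StorageClassConfig` helper are hypothetical names, and a plain `Map` stands in for Hadoop's `Configuration`; only the property name `fs.s3n.storage.class` comes from the proposal above.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical enum; STANDARD and REDUCED_REDUNDANCY are S3 storage class names. */
enum StorageClassOption {
  STANDARD,
  REDUCED_REDUNDANCY
}

/** Hypothetical helper; a Map stands in for Hadoop's Configuration. */
final class StorageClassConfig {
  /** Property name taken from the proposal in this issue. */
  static final String STORAGE_CLASS_KEY = "fs.s3n.storage.class";

  private StorageClassConfig() {
  }

  /**
   * Returns the configured storage class, or empty when the property is
   * unset or blank, so that callers leave the S3 default untouched.
   */
  static Optional<StorageClassOption> resolve(Map<String, String> conf) {
    String raw = conf.get(STORAGE_CLASS_KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return Optional.empty();
    }
    // Accept any letter case; unknown values raise IllegalArgumentException.
    return Optional.of(
        StorageClassOption.valueOf(raw.trim().toUpperCase(Locale.ROOT)));
  }
}
```

      Returning an empty `Optional` rather than a default value keeps the "no override" case explicit, which matches the null checks in the snippets that follow.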

      This override could be implemented in Jets3tNativeFileSystemStore with:
      S3Object object = new S3Object(key);
      ...
      if (storageClass != null) {
        object.setStorageClass(storageClass);
      }

      It would take a more complex form in s3a, e.g. setting:
      InitiateMultipartUploadRequest initiateMPURequest =
          new InitiateMultipartUploadRequest(bucket, key, om);
      if (storageClass != null) {
        initiateMPURequest = initiateMPURequest.withStorageClass(storageClass);
      }

      and similar statements in various places.
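
      One way to avoid scattering such checks is to route every upload request through a single helper. The sketch below is an illustration under stated assumptions, not the change that was committed: the request types are stand-ins rather than the AWS SDK classes, and every name in it (`StorageClassAware`, `SketchPutRequest`, `SketchMultipartRequest`, `StorageClassApplier`) is hypothetical.

```java
/** Stand-in for any upload request that can carry a storage class. */
interface StorageClassAware {
  void setStorageClass(String storageClass);
}

/** Stand-in for a single-part upload request (not the AWS SDK class). */
final class SketchPutRequest implements StorageClassAware {
  private String storageClass; // null means "use the S3 default"

  @Override
  public void setStorageClass(String storageClass) {
    this.storageClass = storageClass;
  }

  String getStorageClass() {
    return storageClass;
  }
}

/** Stand-in for a multipart-upload initiation request (not the AWS SDK class). */
final class SketchMultipartRequest implements StorageClassAware {
  private String storageClass; // null means "use the S3 default"

  @Override
  public void setStorageClass(String storageClass) {
    this.storageClass = storageClass;
  }

  String getStorageClass() {
    return storageClass;
  }
}

/** Hypothetical helper that applies the configured override in one place. */
final class StorageClassApplier {
  private final String storageClass; // null when no override is configured

  StorageClassApplier(String storageClass) {
    this.storageClass = storageClass;
  }

  /** Sets the storage class on the request only when an override exists. */
  <T extends StorageClassAware> T apply(T request) {
    if (storageClass != null) {
      request.setStorageClass(storageClass);
    }
    return request;
  }
}
```

      With a helper of this shape, each place that builds a request would call `apply(...)` once instead of repeating the null check.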


          People

            Assignee: Monthon Klongklaew (monthonk)
            Reporter: Yann Landrin-Schweitzer (ylandrin)
            Votes: 2
            Watchers: 8


              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 3h 50m
