  1. Hadoop Common
  2. HADOOP-15620

Über-jira: S3A phase VI: Hadoop 3.3 features

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.3.0
    • Component/s: fs/s3
    • Labels:
      None
    • Target Version/s:
    • Release Note:
      Lots of enhancements to the S3A code, including
      * Delegation Token support
      * better handling of 404 caching
      * S3Guard performance and resilience improvements

        Sub-Tasks

        1.
        s3a create(overwrite=true) to only look for dir/ and list entries, not file Sub-task Resolved Steve Loughran  
        2.
        S3A init hangs if you try to connect while the system is offline Sub-task Resolved Unassigned  
        3.
        S3AInputStream to implement CanUnbuffer Sub-task Resolved Sahil Takiar  
        4.
        S3A: Consider using TransferManager.download for copyToLocalFile Sub-task Resolved Unassigned  
        5.
        Parallelize S3A directory deletes Sub-task Resolved Unassigned  
        6.
        Add support for S3 Select to S3A Sub-task Resolved Steve Loughran  
        7.
        S3A to support Delegation Tokens Sub-task Resolved Steve Loughran  
        8.
        Stabilise/formalise the JSON _SUCCESS format used in the S3A committers Sub-task Resolved Unassigned  
        9.
        Add S3A implementation of FSMainOperationsBaseTest Sub-task Resolved Steve Loughran  
        10.
        S3a to support get/set permissions through S3 object tags Sub-task Resolved Unassigned  
        11.
        S3a rename() to copy files in a directory in parallel Sub-task Resolved Unassigned  
        12.
        Add HTrace to the s3a connector Sub-task Resolved Madhawa Kasun Gunasekara  
        13.
        S3A should allow renaming to a pre-existing destination directory to move the source path under that directory, similar to HDFS. Sub-task Resolved Unassigned  
        14.
        fs -expunge to take a filesystem Sub-task Resolved Shweta  
        15.
        S3AFileSystem silently deletes "fake" directories when writing a file. Sub-task Resolved Unassigned  
        16.
        S3A Retry policy to retry on NoResponseException Sub-task Resolved Steve Loughran  
        17.
        S3A client raising ConnectionPoolTimeoutException in test Sub-task Resolved Unassigned  
        18.
        Bulk commits of S3A MPUs place needless excessive load on S3 & S3Guard Sub-task Resolved Steve Loughran  
        19.
        S3A log message on rm s3a://bucket/ not intuitive Sub-task Resolved Gabor Bota  
        20.
        S3aUtils.getEncryptionAlgorithm() always logs@Debug "Using SSE-C" Sub-task Resolved Unassigned  
        21.
        S3A warning of obsolete encryption key which is never used Sub-task Resolved Unassigned  
        22.
        add s3guard CLI command to generate session keys for an assumed role Sub-task Resolved Steve Loughran  
        23.
        S3AFileSystem.verifyBucketExists to move to s3.doesBucketExistV2 Sub-task Resolved lqjacklee  
        24.
        FileSystemMultipartUploader should verify that UploadHandle has non-0 length Sub-task Resolved Ewan Higgs  
        25.
        Memory leak in S3AOutputStream Sub-task Resolved Steve Loughran  
        26.
        [s3a] stop treat fs.s3a.max.threads as the long-term minimum Sub-task Resolved Sean Mackrory  
        27.
        S3 listing inconsistency can raise NPE in globber Sub-task Resolved Steve Loughran  
        28.
        remove obsolete S3A test ITestS3ACredentialsInURL Sub-task Resolved Steve Loughran  
        29.
        S3A input stream to use etags/version number to detect changed source files Sub-task Resolved Ben Roling  
        30.
        Move ITestS3AMiniYarnCluster to S3A committers Sub-task Resolved Steve Loughran  
        31.
        @Retries annotation of putObject() call & uses wrong Sub-task Resolved Steve Loughran  
        32.
        Review + update cloud store sensitive keys in hadoop.security.sensitive-config-keys Sub-task Resolved Steve Loughran  
        33.
        ITestS3AContractMultipartUploader#testMultipartUploadEmptyPart test error Sub-task Resolved Ewan Higgs  
        34.
        Some S3A committer tests don't match ITest* pattern; don't run in maven Sub-task Resolved Steve Loughran  
        35.
        get patch for S3a nextReadPos(), through Yetus Sub-task Resolved lqjacklee  
        36.
        Hadoop aws does not use shaded jars Sub-task Resolved Unassigned  
        37.
        Oozie unable to create sharelib in s3a filesystem Sub-task Resolved Steve Loughran  
        38.
        S3A committers: make sure there's regular progress() calls Sub-task Resolved lqjacklee  
        39.
        S3AFileSystem.verifyBucketExists to move to s3.doesBucketExistV2 Sub-task Resolved lqjacklee  
        40.
        Add bouncycastle jars to hadoop-aws as test dependencies Sub-task Resolved Steve Loughran  
        41.
        [DOC] Effective use of FS instances during S3A integration tests Sub-task Resolved Gabor Bota  
        42.
        hamcrest-library declaration in hadoop-aws to be scoped test Sub-task Resolved Steve Loughran  
        43.
        S3A SSL connections should use OpenSSL Sub-task Resolved Sahil Takiar  
        44.
        S3A tests to include Terasort Sub-task Resolved Steve Loughran  
        45.
        Token.toString faulting if any token listed can't load. Sub-task Resolved Steve Loughran  
        46.
        Move DurationInfo from hadoop-aws to hadoop-common org.apache.hadoop.util Sub-task Resolved Abhishek Modi  
        47.
        Parquet reading S3AFileSystem causes EOF Sub-task Resolved Steve Loughran  
        48.
        Update AWS SDK to 1.11.563 Sub-task Resolved Steve Loughran

        49.
        Extend documentation in testing.md about endpoint constants Sub-task Resolved Adam Antal  
        50.
        regression: ITestS3AMiniYarnCluster failing on branch-3.2 Sub-task Resolved Unassigned  
        51.
        S3Guard to add DynamoDBLocal Support Sub-task Resolved lqjacklee  
        52.
        S3A copyFile operation to include source versionID or etag in the copy request Sub-task Resolved Steve Loughran  
        53.
        S3A MarshalledCredentials.toString() doesn't print full date/time of expiry Sub-task Resolved Steve Loughran  
        54.
        S3AUtils.translateException to map CredentialInitializationException to AccessDeniedException Sub-task Resolved Steve Loughran  
        55.
        S3AFileSystem#innerMkdirs builds needless lists Sub-task Resolved Lokesh Jain  
        56.
        warning about user:pass in URI to explicitly call out Hadoop 3.2 as removal Sub-task Resolved Steve Loughran  
        57.
        Improved S3A MR tests Sub-task Resolved Steve Loughran  
        58.
        S3AFileStatus to declare that isEncrypted() is always true Sub-task Resolved Steve Loughran  
        59.
        S3A delegation tests fail if you set fs.s3a.secret.key Sub-task Resolved Unassigned  
        60.
        S3A Etag tests fail with default encryption enabled on bucket Sub-task Resolved Ben Roling  
        61.
        S3A Delegation Token code to spell "Marshalled" as Marshaled Sub-task Resolved Steve Loughran  
        62.
        s3a test docs to mention non-auth; or s3a tests to default to non-auth Sub-task Resolved Unassigned  
        63.
        ClassCastException in S3GuardTool.checkMetadataStoreUri Sub-task Resolved Steve Loughran  
        64.
        Regression: TestStagingPartitionedJobCommit failing with empty etag list Sub-task Resolved Steve Loughran  
        65.
        Remove S3A's dependency on http core Sub-task Resolved Steve Loughran  
        66.
        Test Hang in S3A S3guard test MetadataStoreTestBase.testListChildren Sub-task Resolved Unassigned  
        67.
        Stabilize S3A OpenSSL support Sub-task Resolved Sahil Takiar  
        68.
        MapReduce job tasks fails on S3A ssl3_get_server_certificate:certificate verify Sub-task Resolved Steve Loughran  
        69.
        TeraSort Job failing on S3 DirectoryStagingCommitter: destination path exists Sub-task Resolved Steve Loughran  
        70.
        S3A NullPointerException: null uri host. This can be caused by unencoded / in the password string Sub-task Resolved Unassigned  
        71.
        Option to disable GCM for SSL connections when running on Java 8 Sub-task Resolved Sahil Takiar  
        72.
        S3AInputStream#unbuffer should merge input stream stats into fs-wide stats Sub-task Resolved Sahil Takiar  
        73.
        S3A openFile() options to allow etag/version to be set Sub-task Resolved Unassigned  
        74.
        S3A returns 400 "bad request" on a single path within an S3 bucket Sub-task Resolved Unassigned  
        75.
        AbstractITCommitMRJob.testMRJob test failures Sub-task Resolved Unassigned  
        76.
        Downgrade INFO message on rm s3a root dir to DEBUG Sub-task Resolved Unassigned  
        77.
        ITestS3ACommitterFactory failing, S3 client is not inconsistent Sub-task Resolved Steve Loughran  
        78.
        LocatedFileStatusFetcher scans failing intermittently against S3 store Sub-task Resolved Steve Loughran  
        79.
        Typo in s3a committers.md doc Sub-task Resolved Unassigned  
        80.
        Make last AWS credential provider in default auth chain EC2ContainerCredentialsProviderWrapper Sub-task Resolved Steve Loughran  
        81.
        Restore (documented) fs.s3a.SharedInstanceProfileCredentialsProvider Sub-task Resolved Steve Loughran  
        82.
        S3A delegation token tests fail if fs.s3a.encryption.key set Sub-task Resolved Steve Loughran  
        83.
        S3Guard bucket-info fails if the bucket location is denied to the caller Sub-task Resolved Steve Loughran  
        84.
        S3A retry policy to be exponential Sub-task Resolved Steve Loughran  
        85.
        S3ADelegationTokens to only log at debug on startup Sub-task Resolved Steve Loughran  
        86.
        S3A committers leak threads/raises OOM on job/task commit at scale Sub-task Resolved Steve Loughran  
        87.
        s3a attempts to look up password/encryption fail if JCEKS file unreadable Sub-task Resolved Unassigned  
        88.
        S3A ITestRestrictedReadAccess fails Sub-task Resolved Steve Loughran  
        89.
        Speculating & Partitioned S3A magic committers can leave pending files under __magic Sub-task Resolved Steve Loughran  
        90.
        S3A ITest*MRjob failures Sub-task Resolved Siddharth Seth  
        91.
        S3A ITest failures without S3Guard Sub-task Resolved Steve Loughran  
        92.
        S3A innerGetFileStatus "directories only" scan still does a HEAD Sub-task Resolved Steve Loughran  
        93.
        S3A Delegation Token extension point to use StoreContext Sub-task Resolved Steve Loughran  
        94.
        ITestS3AClosedFS failing -junit test thread Sub-task Resolved Steve Loughran  
        95.
        S3 getBucketLocation() can return "US" for us-east Sub-task Resolved Steve Loughran  
        96.
        S3Guard DDB overreacts to no tag access Sub-task Resolved Gabor Bota  
        97.
        HadoopExecutors cleanup to only log at debug Sub-task Resolved David Mollitor  
        98.
        S3Guard: Make authoritative mode exclusive for metadata - don't check for expiry for authoritative paths Sub-task Resolved Gabor Bota  
        99.
        S3A bucket existence checks to support v2 API and "no checks at all" Sub-task Resolved Mukund Thakur  
        100.
        S3GuardTool to support FilterFileSystem Sub-task Resolved Steve Loughran  
        101.
        s3guard prune can delete directories -leaving orphan children. Sub-task Resolved Steve Loughran  
        102.
        S3Guard to support encrypted DynamoDB table Sub-task Resolved Mingliang Liu  
        103.
        S3A empty dir markers are not created in s3guard as authoritative Sub-task Resolved Steve Loughran  
        104.
        DurationInfo text parsing/formatting should be moved out of hotpath Sub-task Resolved Rajesh Balamohan  
        105.
        Increase timeout unit test rule for MetadataStoreTestBase Sub-task Resolved Mingliang Liu  
        106.
        Refine testing.md to tell user better how to use auth-keys.xml Sub-task Resolved Mingliang Liu  
        107.
        fs.s3a.authoritative.path should support multiple FS URIs Sub-task Resolved Steve Loughran  
        108.
        Filesystem openFile() builder to take a FileStatus param Sub-task Resolved Steve Loughran  
        109.
        S3AInputStream reopening does not handle non IO exceptions properly Sub-task Resolved Sergei Poganshev  
        110.
        Let s3 clients configure request timeout Sub-task Resolved Mustafa Iman  
        111.
        S3Guard listFiles will not query S3 if all listings are authoritative Sub-task Resolved Mustafa Iman  
        112.
        Large DeleteObject requests are their own Thundering Herd Sub-task Resolved Steve Loughran  
        113.
        TestHarFileSystem.testInheritedMethodsImplemented broken Sub-task Resolved Steve Loughran  
        114.
        ITestS3GuardOutOfBandOperations failing on versioned S3 buckets Sub-task Resolved Steve Loughran  
        115.
        S3A reverts KMS encryption to the bucket's default KMS key in rename/copy Sub-task Resolved Mukund Thakur  
        116.
        S3A client retries on SSL Auth exceptions triggered by "." bucket names Sub-task Open Unassigned  

            People

            • Assignee:
              stevel@apache.org Steve Loughran
              Reporter:
              stevel@apache.org Steve Loughran
            • Votes:
              0
              Watchers:
              14


                Time Tracking

                Estimated:
                24h
                Remaining:
                24h
                Logged:
                Not Specified