  Hadoop Common / HADOOP-17566

Über-jira: S3A Hadoop 3.4 features


    Details

    • Type: Bug
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.1
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels:
      None
        Sub-Tasks

        1.
        ITestS3AContractSeek.teardown closes FS before superclass does its cleanup Sub-task Open Unassigned

        100%

        Original Estimate - Not Specified
        Time Spent - 1h 10m
        2.
        Add an Audit plugin point for S3A auditing/context Sub-task Resolved Steve Loughran

        100%

        Original Estimate - Not Specified
        Time Spent - 22h 20m
        3.
        S3A to treat "SdkClientException: Data read has a different length than the expected" as EOFException Sub-task Open Unassigned

        100%

        Original Estimate - Not Specified
        Time Spent - 0.5h
        4.
        Add a MkdirOperation for chained S3 operations during mkdir Sub-task In Progress Steve Loughran

        100%

        Original Estimate - Not Specified
        Time Spent - 1h 50m
        5.
        Magic committer to downgrade abort in cleanup if list uploads fails with access denied Sub-task Resolved Bogdan Stolojan

        100%

        Original Estimate - Not Specified
        Time Spent - 20m
        6.
        Use S3 content-range header to update length of an object during reads Sub-task Open Unassigned  
        7.
        S3Guard import can OOM on large imports Sub-task In Progress Steve Loughran

        100%

        Original Estimate - Not Specified
        Time Spent - 40m
        8.
        S3A ITestPartialRenamesDeletes.testRenameDirFailsInDelete failure: missing directory marker Sub-task Reopened Steve Loughran  
        9.
        fs.s3a.buffer.dir to be under Yarn container path on yarn applications Sub-task Open Unassigned  
        10.
        Support S3 Access Points Sub-task Open Unassigned  
        11.
        Failure of ITestAssumeRole.testRestrictedCommitActions Sub-task Open Steve Loughran  
        12.
        Re-enable optimized copyFromLocal implementation in S3AFileSystem Sub-task Open Unassigned  
        13.
        ITestS3AConfiguration.testProxyConnection failing when s3a bucket probe disabled Sub-task Open Unassigned  
        14.
        S3A AWS Credential provider loading gets confused with isolated classloaders Sub-task Resolved Steve Loughran  
        15.
        S3A (async) ObjectListingIterator to block in hasNext() for results Sub-task Open Steve Loughran

        100%

        Original Estimate - Not Specified
        Time Spent - 1.5h
        16.
        S3A deleteObjects hanging/retrying forever Sub-task Open Unassigned  
        17.
        log accepted/rejected fs.s3a.authoritative.path paths @ debug Sub-task Open Unassigned  
        18.
        AWS AssumedRoleCredentialProvider needs ExternalId add Sub-task Open Unassigned  
        19.
        S3A mkdirs to indicate which parent path element refers to a file Sub-task Resolved Unassigned  
        20.
        transient ITestS3AFileContextStatistics failure -read buffer not filled Sub-task Open Unassigned  
        21.
        S3A DT marshalling to include nested error text in wrapped message Sub-task Open Unassigned  
        22.
        ITestDynamoDBMetadataStore.testTableVersioning failure -DDB deleteItem consistency? Sub-task Open Unassigned  
        23.
        ITestCustomSigner uses absolute paths off the bucket root rather than fork-relative Sub-task Open Unassigned  
        24.
        NPE in s3a byte buffer block upload Sub-task Open Unassigned  
        25.
        GCS to support per-bucket configuration Sub-task Open Unassigned  
        26.
        S3A delegation token binding to support secondary binding list Sub-task In Progress Steve Loughran  
        27.
        S3A client retries on SSL Auth exceptions triggered by "." bucket names Sub-task Open Unassigned  
        28.
        S3AInputStream logging to make it easier to debug file leakage Sub-task Open Unassigned  
        29.
        s3a rename failed during copy, "Unable to copy part" + 200 error code Sub-task Open Unassigned  
        30.
        Filesystem discovery to stop loading implementation classes Sub-task Open Unassigned  
        31.
        Remove transient dependency on hadoop-hdfs-client Sub-task Open Unassigned  
        32.
        S3A input stream to support ByteBufferReadable Sub-task Open Unassigned  
        33.
        Encrypt S3A buffered data on disk Sub-task Open Unassigned  
        34.
        S3a: Failed to reset the request input stream/make S3A uploadPart() retriable Sub-task Open Unassigned  
        35.
        Add AWS S3 Transfer acceleration support Sub-task Open Unassigned  
        36.
        Tune hadoop-aws parallel test surefire/failsafe settings Sub-task Open Unassigned  
        37.
        typo in TestNeworkBinding Sub-task Open Steve Loughran  
        38.
        ITestS3AContractSeek teardown closes test FS before superclass can do its cleanup Sub-task Open Unassigned  
        39.
        S3AFileSystem copyFile to propagate etag/version from getObjectMetadata to copy request Sub-task Open Unassigned  
        40.
        Distcp to set S3 Storage Class Sub-task Open Unassigned

        0%

        Original Estimate - 168h
        Remaining Estimate - 168h
        41.
        S3ARetryPolicy to handle AWS 500 responses/error code TooBusyException with the throttle backoff policy Sub-task Open Unassigned  
        42.
        ITestS3AContractGetFileStatusV1List may have consistency issues Sub-task Resolved Unassigned  
        43.
        S3A to support Requester Pays Buckets Sub-task Patch Available Mandus Momberg

        0%

        Original Estimate - 2h
        Remaining Estimate - 2h
        44.
        ITestS3AAWSCredentialsProvider tests fail if a bucket has DTs enabled Sub-task Open Unassigned  
        45.
        S3A DT support to warn when loading expired token Sub-task Open Steve Loughran  
        46.
        S3A can support short user-friendly aliases for configuration of credential providers. Sub-task Open Unassigned  
        47.
        S3AFileStatus to add a serialVersionUID; review & test serialization Sub-task Open Unassigned  
        48.
        S3A to support configuring various AWS S3 client extended options Sub-task Open Unassigned  
        49.
        Report problems w/ local S3A buffer directory meaningfully Sub-task Open Unassigned  
        50.
        Distcp is unable to determine region with S3 PrivateLink endpoints Sub-task Open Unassigned  
        51.
        test and document use of fs.s3a.signing-algorithm Sub-task Open Unassigned  
        52.
        S3A getContentSummary() to move to listFiles(recursive) to count children; instrument use Sub-task Open Unassigned  
        53.
        s3a to set fake directory marker contentType to application/x-directory Sub-task Resolved Steve Loughran  
        54.
        S3A copy/rename of large files to be parallelized as a multipart operation Sub-task Open Unassigned  
        55.
        ITestS3A select tests fail if user kinited in Sub-task Open Unassigned  
        56.
        s3guard uploads command to list date and initiator of outstanding uploads Sub-task Open Unassigned  
        57.
        hadoop-cloud-storage transient dependencies need review Sub-task Open Unassigned  
        58.
        Add common getFileBlockLocations() emulation for object stores, including S3A Sub-task Patch Available Steve Loughran  
        59.
        S3AInputStream.seek should throw EOFException if seeking past the end of file Sub-task Open Unassigned  
        60.
        Add custom InstanceProfileCredentialsProvider with more resilience to throttling Sub-task Open Unassigned  
        61.
        S3A to implement rename(final Path src, final Path dst, final Rename... options) Sub-task Open Unassigned  
        62.
        cherry pick s3 enhancements from PrestoS3FileSystem Sub-task Open Unassigned  
        63.
        Some S3A tests leak filesystem instances Sub-task Open Unassigned  
        64.
        Use lighter-weight alternatives to innerGetFileStatus where possible Sub-task Open Unassigned  
        65.
        Possible inconsistent state of AbstractDelegationTokenSecretManager Sub-task Patch Available Hankó Gergely

        100%

        Original Estimate - Not Specified
        Time Spent - 1h 10m
        66.
        Review S3A documentation to make sure it is consistent with the current codebase Sub-task Open Unassigned  
        67.
        strip s3.amazonaws.com off hostnames before making s3a calls Sub-task Open Unassigned  
        68.
        S3A Filesystem does not check return from AmazonS3Client deleteObjects Sub-task Open Unassigned  
        69.
        s3a to improve diags on s3a bad request message Sub-task Open Unassigned  
        70.
        S3 Select Exceptions are not being converted to IOEs Sub-task Open Unassigned  
        71.
        Improve isolation of FS instances in S3A committer tests Sub-task Open Unassigned  
        72.
        ITestS3ARemoteFileChanged doesn't overwrite test data creation Sub-task Open Unassigned  
        73.
        Handle S3A "glacier" data Sub-task Open Unassigned  
        74.
        AbstractContractDistCpTest to test attr preservation with -p, verify blobstores downgrade Sub-task Open Steve Loughran  
        75.
        Add S3AWriteOpContext for write ops; pass in statistics and other settings Sub-task Open Unassigned  
        76.
        S3AFilesystem trash handling should respect the current UGI Sub-task Open Unassigned  
        77.
        Impersonate hosts in s3a for better data locality handling Sub-task Open Thomas Demoor  
        78.
        test YARN log collection works to s3a Sub-task Open Unassigned  
        79.
        s3a rm on the CLI generates deprecation warning on io.bytes.per.checksum Sub-task Open Unassigned  
        80.
        review S3A translateException translation matches IBM CORS spec Sub-task Open Unassigned  
        81.
        NPE in S3AInputStream.read() in ITestS3AInconsistency.testOpenFailOnRead Sub-task Open Unassigned  
        82.
        S3A add histogram metrics types for latency, etc. Sub-task Resolved Sean Mackrory  
        83.
        Add S3A support for Async Scatter/Gather IO Sub-task Open Gabor Bota  
        84.
        S3AInputStream read(bytes[]) to not retry on read failure: pass action up Sub-task Open Unassigned  
        85.
        ITestS3AContractRootDir failure on non-S3Guarded bucket Sub-task Open Unassigned  
        86.
        s3a new getdefaultblocksize be called in getFileStatus which has not been implemented in s3afilesystem yet Sub-task Open Unassigned  
        87.
        s3guard to provide better diags on ddb init failures Sub-task Open Unassigned  
        88.
        S3A Secret access to fall back to XML if credential provider raises IOE. Sub-task Open Unassigned  
        89.
        Add a way for an FS instance to say "really, no trash interval at all" Sub-task Open Unassigned  
        90.
        add a special 0 byte input stream for empty blobs Sub-task Open Unassigned  
        91.
        S3A FS to add "s3a:no-existence-checks" to the builder file creation option set Sub-task Open Unassigned  
        92.
        S3 SSEC tests to downgrade when running against a mandatory encryption object store Sub-task Open Unassigned  
        93.
        S3A to use a thread pool for async path operations Sub-task Open Unassigned  
        94.
        s3a doesn't consider blobs with trailing / and content-length >0 as directories Sub-task Open Unassigned  
        95.
        s3guard bucket-info command to include default bucket encryption info Sub-task Open Unassigned  
        96.
        clean up ITestS3AFileSystemContract Sub-task Patch Available Unassigned  
        97.
        Encrypt S3A data client-side with AWS SDK (S3-CSE) Sub-task Patch Available Igor Mazur

        100%

        Original Estimate - Not Specified
        Time Spent - 3h 20m
        98.
        S3a auth exception to link to a wiki page on the problem Sub-task Open Unassigned  
        99.
        Add some S3A-specific create file options Sub-task Open Unassigned  
        100.
        S3A: Set thread names with more specific information about the call. Sub-task Open Unassigned  
        101.
        shell rm command to not rename to ~/.Trash in object stores Sub-task Open Unassigned  
        102.
        Test hadoop fs shell against s3a; fix problems Sub-task Open Unassigned  
        103.
        Document `dynamodb:TagResource` an explicit client-side permission for S3Guard Sub-task Open Gabor Bota  
        104.
        build up md5 checksum as blocks are built in S3ABlockOutputStream; validate upload Sub-task Open Unassigned  
        105.
        increase the default number of threads and http connections in S3A Sub-task Open Unassigned  
        106.
        support git-secrets commit hook to keep AWS secrets out of git Sub-task Patch Available Steve Loughran  
        107.
        multipart/huge file upload tests to look at checksums returned Sub-task Open Unassigned  
        108.
        declare that fs.s3a.ext. is a prefix for arbitrary extensions Sub-task Open Unassigned  
        109.
        ITestS3AMiniYarnCluster fails on sequential runs with Kerberos error Sub-task Open Unassigned  
        110.
        S3a DelegationToken bindings to support a "correlation ID" for the UA header Sub-task Open Unassigned  
        111.
        S3AInputStream.skip() to use lazy seek Sub-task Open Unassigned  
        112.
        improve setting of max connections in AWS client Sub-task Open Unassigned  
        113.
        FileSystem/s3a processDeleteOnExit to skip the exists() check Sub-task Open Unassigned  
        114.
        Support AWS S3 reduced redundancy storage class Sub-task Open Unassigned  
        115.
        Speed up S3A test runs Sub-task Open Unassigned  
        116.
        remove misleading fs.s3a.delegation.tokens.enabled prompt Sub-task Open Unassigned  
        117.
        Clarify committers.md around v2 failure handling Sub-task Open Unassigned  
        118.
        make s3a read fault injection configurable including "off" Sub-task Open Unassigned  
        119.
        S3a operations keep retrying if the password is wrong Sub-task Open Thomas Poepping  
        120.
        AWS Data read stack trace in S3a putObjectDirect Sub-task Open Unassigned  
        121.
        S3aDelegationTokens to add accessor for tests to get at the token binding Sub-task Open Unassigned  
        122.
        s3guard bucket-info command to add a verify-property <key>=<value> <bucket> Sub-task Open Unassigned  
        123.
        Test MR split optimisation with recursive listing Sub-task Open Unassigned  
        124.
        increase performance of s3guard import command Sub-task Resolved Unassigned  
        125.
        Support multipart download in S3AFileSystem Sub-task Open Unassigned  
        126.
        S3A to add option fs.s3a.endpoint.region to set AWS region Sub-task Resolved Mehakmeet Singh

        100%

        Original Estimate - Not Specified
        Time Spent - 3h
        127.
        Upgrade aws-java-sdk to 1.11.993 or later Sub-task Resolved Steve Loughran

        100%

        Original Estimate - Not Specified
        Time Spent - 1h
        128.
        S3A NetworkBinding has a runtime class dependency on a third-party shaded class Sub-task Resolved Steve Loughran

        100%

        Original Estimate - Not Specified
        Time Spent - 2h
        129.
        compatibility table in directory_markers.md doesn't render right Sub-task Open Steve Loughran  
        130.
        s3a listing IOStatistics to count #of entries returned per LIST call Sub-task Open Unassigned  

            People

            • Assignee:
              stevel@apache.org Steve Loughran
              Reporter:
              stevel@apache.org Steve Loughran
            • Votes: 0
            • Watchers: 1

              Dates

              • Created:
                Updated:

                Time Tracking

                Estimated: 170h
                Remaining: 170h
                Logged: 38h 50m