  1. Hadoop Common
  2. HADOOP-19353

Über-jira: S3A Hadoop 3.4.2 features

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.4.1
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels: None

    Description

      Über-jira for work we want in 3.4.2 for the S3A connector

        Issue Links

          1.
          Support multipart download in S3AFileSystem Sub-task Open Unassigned  
          2.
          S3A connector to improve support for all AWS partitions Sub-task Open Unassigned  
          3.
          S3AFilesystem trash handling should respect the current UGI Sub-task Open Unassigned  
          4.
          Some S3A tests leak filesystem instances Sub-task Open Unassigned

          Original Estimate - Not Specified
          Time Spent - 0.5h
          5.
          S3A AssumedRole credentials provider should use Instance Role credentials in chain for assuming role Sub-task Open Unassigned  
          6.
          multipart/huge file upload tests to look at checksums returned Sub-task Open Unassigned  
          7.
          ITestS3A select tests fail if user kinited in Sub-task Open Unassigned  
          8.
          S3AInputStream.seek should throw EOFException if seeking past the end of file Sub-task Open Unassigned  
          9.
          s3guard bucket-info command to add a verify-property <key>=<value> <bucket> Sub-task Open Unassigned  
          10.
          S3a DelegationToken bindings to support a "correlation ID" for the UA header Sub-task Open Unassigned
          11.
          S3a operations keep retrying if the password is wrong Sub-task Open Thomas Poepping  
          12.
          ITestS3AContractSeek teardown closes test FS before superclass can do its cleanup Sub-task Open Unassigned  
          13.
          Impersonate hosts in s3a for better data locality handling Sub-task Open Thomas Demoor  
          14.
          S3A: Support S3 Conditional Writes Sub-task Open Unassigned  
          15.
          Understand status of S3 access point alias support in S3A Sub-task Open Unassigned  
          16.
          Use S3 content-range header to update length of an object during reads Sub-task Open Monthon Klongklaew

          Original Estimate - Not Specified
          Time Spent - 50m
          17.
          Speed up S3A test runs Sub-task Open Unassigned  
          18.
          S3A Secret access to fall back to XML if credential provider raises IOE. Sub-task Open Unassigned  
          19.
          Add common getFileBlockLocations() emulation for object stores, including S3A Sub-task Patch Available Steve Loughran  
          20.
          S3A deleteObjects hanging/retrying forever Sub-task Open Unassigned  
          21.
          shell rm command to not rename to ~/.Trash in object stores Sub-task Open Unassigned  
          22.
          Clarify committers.md around v2 failure handling Sub-task Open Unassigned  
          23.
          ITestS3ARemoteFileChanged doesn't overwrite test data creation Sub-task Open Unassigned  
          24.
          S3A: Test failures with CSE enabled Sub-task Resolved Ahmar Suhail  
          25.
          Add s3a tool to convert S3 server logs to avro/csv files Sub-task In Progress Steve Loughran  
          26.
          S3AInputStream logging to make it easier to debug file leakage Sub-task Open Unassigned  
          27.
          Add new store vendor config option Sub-task Open Unassigned  
          28.
          S3A: Set thread names with more specific information about the call. Sub-task Open Unassigned  
          29.
          S3A can support short user-friendly aliases for configuration of credential providers. Sub-task Open Unassigned  
          30.
          s3guard uploads command to list date and initiator of outstanding uploads Sub-task Open Unassigned  
          31.
          TestS3AGetFileStatus:testNotFound() to use intercept() Sub-task Open Unassigned  
          32.
          log accepted/rejected fs.s3a.authoritative.path paths @ debug Sub-task Open Unassigned  
          33.
          cherry pick s3 enhancements from PrestoS3FileSystem Sub-task Open Unassigned
          34.
          S3A: support custom S3 and STS headers Sub-task Open Prerak Pradhan  
          35.
          GCS to support per-bucket configuration Sub-task Open Unassigned  
          36.
          Support AWS IAM Identity Centre (prev. AWS SSO) for providing credentials to S3A Sub-task Open Unassigned  
          37.
          strip s3.amazonaws.com off hostnames before making s3a calls Sub-task Open Unassigned  
          38.
          S3A DeleteOperation to parallelize POSTing of bulk deletes Sub-task Open Unassigned  
          39.
          S3AInputStream.remainingInFile should use nextReadPos Sub-task Reopened lqjacklee  
          40.
          ITestS3AMiniYarnCluster fails on sequential runs with Kerberos error Sub-task Open Unassigned  
          41.
          Add a way for an FS instance to say "really, no trash interval at all" Sub-task Open Unassigned  
          42.
          improve s3a committer stats collected Sub-task Open Unassigned  
          43.
          S3A: add option to disable probe for dir marker recreation on delete/rename. Sub-task Open Harshit Gupta  
          44.
          test and document use of fs.s3a.signing-algorithm Sub-task Open Unassigned  
          45.
          Review S3A documentation to make sure it is consistent with the current codebase Sub-task Open Unassigned  
          46.
          ITestS3AAWSCredentialsProvider tests fail if a bucket has DTs enabled Sub-task Open Unassigned  
          47.
          S3a auth exception to link to a wiki page on the problem Sub-task Open Unassigned  
          48.
          review S3A translateException translation matches IBM CORS spec Sub-task Open Unassigned  
          49.
          S3A input stream to support ByteBufferReadable Sub-task Open Unassigned  
          50.
          S3A Authentication to support WebIdentity Sub-task Open Unassigned

          Original Estimate - Not Specified
          Time Spent - 3h 20m
          51.
          S3A to implement rename(final Path src, final Path dst, final Rename... options) Sub-task Open Unassigned  
          52.
          S3ARetryPolicy to handle AWS 500 responses/error code TooBusyException with the throttle backoff policy Sub-task Open Unassigned  
          53.
          S3A Xattr headers need hdfs-compatible prefix Sub-task Open Unassigned  
          54.
          Amazon S3 disabling ACLs on all new buckets Sub-task Open Unassigned  
          55.
          Encrypt S3A buffered data on disk Sub-task Open Unassigned  
          56.
          S3 Select Exceptions are not being converted to IOEs Sub-task Open Unassigned  
          57.
          S3AFileStatus to add a serialVersionUID; review & test serialization Sub-task Open Unassigned  
          58.
          Add S3AWriteOpContext for write ops; pass in statistics and other settings Sub-task Open Unassigned  
          59.
          S3A DT support to warn when loading expired token Sub-task Open Steve Loughran  
          60.
          S3A (async) ObjectListingIterator to block in hasNext() for results Sub-task Open Steve Loughran

          Original Estimate - Not Specified
          Time Spent - 1h 40m
          61.
          hadoop-aws tests to take a configurable subdir in the test bucket Sub-task Open Unassigned  
          62.
          S3A DT marshalling to include nested error text in wrapped message Sub-task Open Unassigned  
          63.
          clean up ITestS3AFileSystemContract Sub-task Patch Available Unassigned  
          64.
          S3A Xattr/getXAttr to handle directories without markers Sub-task Open Unassigned  
          65.
          Add custom InstanceProfileCredentialsProvider with more resilience to throttling Sub-task Open Unassigned  
          66.
          Support Overwrite Directory On Commit For S3A Committers Sub-task Open Syed Shameerur Rahman  
          67.
          s3 and abfs incremental listing: use SAX parsers to stream results to list iterators Sub-task Open Unassigned  
          68.
          make s3a read fault injection configurable including "off" Sub-task Open Unassigned  
          69.
          S3A: fs.s3a.connection.request.timeout too low for large uploads over slow links Sub-task Reopened Steve Loughran  
          70.
          remove/deprecate fs.s3a.multipart.purge Sub-task Open Unassigned  
          71.
          s3guard bucket-info command to include default bucket encryption info Sub-task Open Unassigned  
          72.
          Optimise S3A’s recursive delete to drop successful S3 keys on retry of S3 DeleteObjects Sub-task Open Unassigned  
          73.
          S3A client retries on SSL Auth exceptions triggered by "." bucket names Sub-task Open Unassigned  
          74.
          S3A: Allow SSE configurations per object path Sub-task Open Unassigned  
          75.
          Test MR split optimisation with recursive listing Sub-task Open Unassigned  
          76.
          s3a rm on the CLI generates deprecation warning on io.bytes.per.checksum Sub-task Open Unassigned  
          77.
          ITestS3AFileSystemStatistic failure on mvn verify Sub-task Open Unassigned  
          78.
          add a special 0 byte input stream for empty blobs Sub-task Open Unassigned  
          79.
          S3A to support configuring various AWS S3 client extended options Sub-task Open Unassigned  
          80.
          FileSystem/s3a processDeleteOnExit to skip the exists() check Sub-task Open Unassigned  
          81.
          ITestS3AInputStreamPerformance#testDecompressionSequential128K NPE if no CSV file available Sub-task Open Unassigned  
          82.
          s3a to improve diags on s3a bad request message Sub-task Open Unassigned  
          83.
          Handle S3A "glacier" data Sub-task Open Bhavay Pahuja  
          84.
          remove filtering of directory markers in s3a RenameOperation Sub-task Open Unassigned  
          85.
          Filesystem discovery to stop loading implementation classes Sub-task Open Unassigned  
          86.
          s3a new getdefaultblocksize be called in getFileStatus which has not been implemented in s3afilesystem yet Sub-task Open Unassigned  
          87.
          s3a listing IOStatistics to count #of entries returned per LIST call Sub-task Open Unassigned  
          88.
          ITestS3AConfiguration.testProxyConnection failing when s3a bucket probe disabled Sub-task Open Unassigned  
          89.
          ITestS3ABlockOutputArray failure with IO File name too long Sub-task Open Unassigned  
          90.
          Add a way to get the IOStatistics of active filesystems in long-lived processes Sub-task Open Unassigned  
          91.
          NPE in S3AInputStream.read() in ITestS3AInconsistency.testOpenFailOnRead Sub-task Open Unassigned  
          92.
          Public dataset class for S3A integration tests Sub-task Open Daniel Carl Jones  
          93.
          S3AInputStream read(bytes[]) to not retry on read failure: pass action up Sub-task Open Ahmar Suhail

          Original Estimate - Not Specified
          Time Spent - 40m
          94.
          Report problems w/ local S3A buffer directory meaningfully Sub-task Open Unassigned  
          95.
          test YARN log collection works to s3a Sub-task Open Unassigned  
          96.
          AbstractContractDistCpTest to test attr preservation with -p, verify blobstores downgrade Sub-task Open Steve Loughran  
          97.
          Add "versions" tool to s3a command line entry point Sub-task Open Unassigned  
          98.
          Tune hadoop-aws parallel test surefire/failsafe settings Sub-task Open Unassigned  
          99.
          build up md5 checksum as blocks are built in S3ABlockOutputStream; validate upload Sub-task Open Unassigned
          100.
          S3aDelegationTokens to add accessor for tests to get at the token binding Sub-task Open Unassigned  
          101.
          S3A: ITestS3AFileContextURI: MultiObjectDeleteException bulk delete of odd filenames Sub-task Open Unassigned  
          102.
          ITestS3ACopyFromLocalFile: AuditFailureException Sub-task Open Unassigned  
          103.
          s3a client SSLException is raised after very long timeout "Unsupported or unrecognized SSL message" Sub-task Open Unassigned  
          104.
          S3A doesn't calculate Content-MD5 on uploads Sub-task Open Unassigned  
          105.
          Warn when no region is configured Sub-task Open Unassigned  
          106.
          Test hadoop fs shell against s3a; fix problems Sub-task Open Unassigned

          Original Estimate - Not Specified
          Time Spent - 2h
          107.
          Use lighter-weight alternatives to innerGetFileStatus where possible Sub-task Open Unassigned  
          108.
          support git-secrets commit hook to keep AWS secrets out of git Sub-task Patch Available Steve Loughran  
          109.
          increase the default number of threads and http connections in S3A Sub-task Open Unassigned  
          110.
          S3AInputStream.skip() to use lazy seek Sub-task Open Ahmar Suhail

          Original Estimate - Not Specified
          Time Spent - 4h 20m
          111.
          Possible inconsistent state of AbstractDelegationTokenSecretManager Sub-task Patch Available Hankó Gergely

          Original Estimate - Not Specified
          Time Spent - 1h 10m
          112.
          S3A Filesystem does not check return from AmazonS3Client deleteObjects Sub-task Open Unassigned  
          113.
          define s3a encryption behaviour on copy Sub-task Open Unassigned  
          114.
          Remove fs.s3a.executor.capacity Sub-task Open Unassigned  
          115.
          ITestCustomSigner uses absolute paths off the bucket root rather than fork-relative Sub-task Open Unassigned  
          116.
          AWS AssumedRoleCredentialProvider needs ExternalId added Sub-task Open Unassigned
          117.
          Remove transient dependency on hadoop-hdfs-client Sub-task Open Unassigned  
          118.
          Add AWS S3 Transfer acceleration support Sub-task Open Unassigned  
          119.
          S3A openFile() options to allow etag/version to be set Sub-task Reopened Unassigned  
          120.
          S3 Express: document use Sub-task Open Unassigned  
          121.
          S3A third party: document "Certificate doesn't match" Sub-task Open Unassigned  
          122.
          S3A: handle alternative forms of connection failure Sub-task Open Unassigned  
          123.
          NoSuchMethodError in aws sdk third party logger in hadoop aws 3.4 Sub-task Open Unassigned  
          124.
          S3A: IAMCredentialsProvider throttling results in AWS auth failures Sub-task Open Unassigned  
          125.
          S3A: ITestCustomSigner failing against S3Express Buckets Sub-task Open Unassigned  
          126.
          S3A: transfer manager not wired up to s3a executor pool Sub-task Open Unassigned  
          127.
          Unknown S3Express bucket raises UnknownHostException rather than NoSuchBucketException; will block for retries Sub-task Open Unassigned  
          128.
          remove head bucket request from calls which can be made without audit header Sub-task Open Unassigned  
          129.
          S3A Assume role tests failing against S3Express stores Sub-task Open Unassigned  
          130.
          S3A: Support dynamic region resolution Sub-task Open Steve Loughran  
          131.
          S3A HeaderProcessing to process all metadata entries of HEAD response Sub-task Open Unassigned  
          132.
          S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure Sub-task Open Unassigned  
          133.
          s3a file rename does double HEAD or LIST on source file/dir Sub-task Open Unassigned  
          134.
          S3A: retry on credential expiry Sub-task Open Unassigned  
          135.
          S3A: TestIAMInstanceCredentialsProvider.testIAMInstanceCredentialsInstantiate failure Sub-task Open Unassigned  
          136.
          AWS SDK V2 - Add socket factory to Netty Client Sub-task Open Unassigned  
          137.
          AWS SDK V2 - Move to S3 Java async client Sub-task Open Ahmar Suhail  
          138.
          S3A: detect and recover from SSL ConnectionReset exceptions Sub-task Open Unassigned  
          139.
          S3A: ITestS3AConfiguration failing with region problems Sub-task Open Unassigned
          140.
          S3A: remove @deprecated tags where no longer needed Sub-task Open Unassigned  
          141.
          AWS SDK V2 - Refactor getS3Region & other follow up items Sub-task Open Unassigned  
          142.
          coalesce AWS S3 client proxy settings Sub-task Open Unassigned  
          143.
          S3A: Add LeakReporter; use in S3AInputStream Sub-task Resolved Steve Loughran  
          144.
          Move LeakReporter to utils, use more Sub-task Open Unassigned  
          145.
          Support S3A cross region access when S3 region/endpoint is set Sub-task Resolved Syed Shameerur Rahman  
          146.
          S3A: Add config option to skip test with performance mode Sub-task Open Chung En Lee  
          147.
          S3A: terasort tests fail with CSE-KMS enabled and London region with Delegation Token Secrets Sub-task Open Syed Shameerur Rahman

            People

              Assignee: Steve Loughran (stevel@apache.org)
              Reporter: Steve Loughran (stevel@apache.org)
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 14.5h