Details

    • Type: Sub-task
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0-beta1
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels:
      None
    • Target Version/s:

      Description

      The S3A CLI will need to be able to list and delete pending multipart commits.

      Cleanup is already possible through the fs.s3a properties. The CLI will let scripts check for outstanding data (signalled through a distinct exit code) and let batch jobs trigger cleanups explicitly.

      This will become critical with the multipart committer, as there's a significantly higher likelihood of commits remaining outstanding.

      We may also want to be able to enumerate and cancel all pending commits in the FS tree.

      1. HADOOP-13974.003.patch
        43 kB
        Aaron Fabbri
      2. HADOOP-13974.002.patch
        41 kB
        Aaron Fabbri
      3. HADOOP-13974.001.patch
        37 kB
        Aaron Fabbri

        Activity

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 15s Docker mode activated.
              Prechecks
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 3 new or modified test files.
              trunk Compile Tests
        +1 mvninstall 13m 49s trunk passed
        +1 compile 0m 22s trunk passed
        +1 checkstyle 0m 15s trunk passed
        +1 mvnsite 0m 26s trunk passed
        +1 shadedclient 9m 15s branch has no errors when building and testing our client artifacts.
        +1 findbugs 0m 31s trunk passed
        +1 javadoc 0m 15s trunk passed
              Patch Compile Tests
        +1 mvninstall 0m 21s the patch passed
        +1 compile 0m 20s the patch passed
        +1 javac 0m 20s the patch passed
        -0 checkstyle 0m 12s hadoop-tools/hadoop-aws: The patch generated 2 new + 5 unchanged - 0 fixed = 7 total (was 5)
        +1 mvnsite 0m 22s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 shadedclient 11m 3s patch has no errors when building and testing our client artifacts.
        -1 findbugs 0m 55s hadoop-tools/hadoop-aws generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
        +1 javadoc 0m 16s the patch passed
              Other Tests
        +1 unit 0m 48s hadoop-aws in the patch passed.
        +1 asflicense 0m 22s The patch does not generate ASF License warnings.
        40m 26s



        Reason Tests
        FindBugs module:hadoop-tools/hadoop-aws
          Found reliance on default encoding in org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$Uploads.promptBeforeAbort(PrintStream):in org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$Uploads.promptBeforeAbort(PrintStream): new java.util.Scanner(InputStream) At S3GuardTool.java:[line 1179]



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:3d04c00
        JIRA Issue HADOOP-13974
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12891796/HADOOP-13974.003.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
        uname Linux 11dab2c2d246 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / e46d5bb
        Default Java 1.8.0_144
        findbugs v3.1.0-RC1
        checkstyle https://builds.apache.org/job/PreCommit-HADOOP-Build/13500/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-aws.txt
        findbugs https://builds.apache.org/job/PreCommit-HADOOP-Build/13500/artifact/patchprocess/new-findbugs-hadoop-tools_hadoop-aws.html
        Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/13500/testReport/
        modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
        Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/13500/console
        Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.
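
The FindBugs warning above concerns `new java.util.Scanner(InputStream)`, which decodes input with the platform default charset. A minimal sketch of one way to avoid that is to name the charset explicitly. The class and method names here are hypothetical and are not taken from the patch; UTF-8 is an assumed choice.

```java
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

/** Hypothetical helper for illustration; not part of S3GuardTool. */
final class ConsoleAnswer {
  private ConsoleAnswer() {
  }

  /** Reads the first line of input, decoding bytes as UTF-8 (an assumed choice). */
  static String readLine(InputStream in) {
    // Scanner(InputStream) alone would fall back to the platform default charset;
    // passing the charset name makes the decoding explicit.
    Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name());
    return scanner.hasNextLine() ? scanner.nextLine().trim() : "";
  }
}
```

The two-argument `Scanner` constructor has been available since Java 5, so it works on the Java 1.8 runtime reported in the build notes.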

        fabbri Aaron Fabbri added a comment -

        Yes, Steve Loughran, happy to focus on getting the main patch merged. I don't mind rebasing this after we get HADOOP-13786 in.

        stevel@apache.org Steve Loughran added a comment -

        Welcome as it is, I'd like to hold this off until we can get the HADOOP-13786 patch in, at least into a branch we set up for the s3guard stuff.

        Why? Not just because it's related, but because the committer patch has the retry logic and routes all low-level S3 IO through S3AFS, possibly via a now-extracted WriteOperationsHelper (which doesn't take a key anymore either).

        I don't want to add any more invocations of S3 which aren't resilient to transient failures, throttling, etc.

        fabbri Aaron Fabbri added a comment -

        v3 patch.

        • Adds an "are you sure" prompt any time `-abort` is used. Also adds a `-force` option to override that.
        • Updates docs.
        • Fixes findbugs / checkstyle issues. I ran test-patch on v2 and was delighted it was clean on the first try, but apparently something is not working right.
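
The prompt-plus-override behaviour described in the first bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patch: the class name, method name, prompt wording, and the rule that only "yes" confirms are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of an "are you sure" check with a force override. */
final class AbortConfirmation {
  private AbortConfirmation() {
  }

  /**
   * Returns true if the abort may proceed.
   * With force set, the prompt is skipped entirely; otherwise only an
   * explicit "yes" (case-insensitive) confirms. EOF counts as a refusal.
   */
  static boolean confirmed(boolean force, InputStream in, PrintStream out) {
    if (force) {
      return true;
    }
    out.print("Abort the matching uploads? (yes/no) ");
    out.flush();
    BufferedReader reader =
        new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    try {
      String answer = reader.readLine();
      return answer != null && answer.trim().equalsIgnoreCase("yes");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Treating EOF as "no" is a deliberate choice in this sketch: a batch job with no attached terminal then fails safe unless it passes the force flag.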
        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 21s Docker mode activated.
              Prechecks
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 3 new or modified test files.
              trunk Compile Tests
        +1 mvninstall 13m 55s trunk passed
        +1 compile 0m 19s trunk passed
        +1 checkstyle 0m 13s trunk passed
        +1 mvnsite 0m 23s trunk passed
        +1 shadedclient 8m 42s branch has no errors when building and testing our client artifacts.
        +1 findbugs 0m 28s trunk passed
        +1 javadoc 0m 14s trunk passed
              Patch Compile Tests
        +1 mvninstall 0m 20s the patch passed
        +1 compile 0m 18s the patch passed
        +1 javac 0m 18s the patch passed
        -0 checkstyle 0m 12s hadoop-tools/hadoop-aws: The patch generated 21 new + 5 unchanged - 0 fixed = 26 total (was 5)
        +1 mvnsite 0m 23s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 shadedclient 9m 26s patch has no errors when building and testing our client artifacts.
        -1 findbugs 0m 42s hadoop-tools/hadoop-aws generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
        -1 javadoc 0m 15s hadoop-tools_hadoop-aws generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
              Other Tests
        +1 unit 0m 47s hadoop-aws in the patch passed.
        +1 asflicense 0m 19s The patch does not generate ASF License warnings.
        37m 53s



        Reason Tests
        FindBugs module:hadoop-tools/hadoop-aws
          Unused field:MultipartUtils.java
          Boxing/unboxing to parse a primitive org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$Uploads.processArgs(List, PrintStream) At S3GuardTool.java:org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$Uploads.processArgs(List, PrintStream) At S3GuardTool.java:[line 1232]



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:3d04c00
        JIRA Issue HADOOP-13974
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12891635/HADOOP-13974.002.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
        uname Linux bbfc6e95bea2 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 13fcfb3
        Default Java 1.8.0_144
        findbugs v3.1.0-RC1
        checkstyle https://builds.apache.org/job/PreCommit-HADOOP-Build/13496/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-aws.txt
        findbugs https://builds.apache.org/job/PreCommit-HADOOP-Build/13496/artifact/patchprocess/new-findbugs-hadoop-tools_hadoop-aws.html
        javadoc https://builds.apache.org/job/PreCommit-HADOOP-Build/13496/artifact/patchprocess/diff-javadoc-javadoc-hadoop-tools_hadoop-aws.txt
        Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/13496/testReport/
        modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
        Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/13496/console
        Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.
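
The second FindBugs item above ("Boxing/unboxing to parse a primitive") typically appears when a string is parsed with `Integer.valueOf` and the result is immediately used as an `int`. The report does not say which argument `processArgs` was parsing, so the sketch below assumes a simple non-negative count; the class and method names are hypothetical.

```java
/** Hypothetical argument parser for illustration; not the code in S3GuardTool. */
final class CountArgument {
  private CountArgument() {
  }

  /** Parses a non-negative count without creating an intermediate Integer. */
  static int parseCount(String value) {
    // Integer.valueOf(value) would allocate (or look up) an Integer and then
    // unbox it; Integer.parseInt returns the primitive directly.
    int count = Integer.parseInt(value.trim());
    if (count < 0) {
      throw new IllegalArgumentException("count must be non-negative: " + value);
    }
    return count;
  }
}
```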

        fabbri Aaron Fabbri added a comment -

        v2 patch:

        • Adds "-expect" option as suggested above by Steve Loughran. Also includes new test case for it, as well as using the option in existing test cases.
        • Adds a test case for the age-based filtering of -list and -delete.

        Still pending: adding an "are you sure" prompt w/ -force override.
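
As a rough illustration of how an "-expect" check could map onto an exit code for scripts and tests, here is a minimal sketch. The exit code values and all names are assumptions made for the example; the thread does not state which codes the tool returns.

```java
import java.util.OptionalInt;

/** Hypothetical sketch of an -expect style check; exit codes are assumed values. */
final class ExpectCheck {
  static final int EXIT_OK = 0;
  static final int EXIT_MISMATCH = 1; // assumed value, not taken from the patch

  private ExpectCheck() {
  }

  /**
   * Returns EXIT_MISMATCH only when an expected count was given and the
   * number of pending uploads differs from it.
   */
  static int exitCode(int pendingUploads, OptionalInt expected) {
    if (expected.isPresent() && expected.getAsInt() != pendingUploads) {
      return EXIT_MISMATCH;
    }
    return EXIT_OK;
  }
}
```

A test that wants to assert "no uploads are outstanding" would pass an expected count of zero and check for a zero exit code.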

        fabbri Aaron Fabbri added a comment - - edited

        Attaching v1 patch.

        Some caveats I'd like feedback on:

        • The `hadoop s3guard uploads -abort` command is dangerous, especially if there are MPU commits in flight. I feel like this could use a failsafe "are you sure" prompt. I'm leaning towards implementing that when the specified age is less than one day. Presumably this is "safer" if your filter only matches older upload parts. Or, we could add a "-force" option to override it.
        • I also think a disclaimer about clock synchronization might be warranted in the docs. S3's MPU list gives an "initiated time", which I use here. I'm wondering whether this is server side or client side.
        • There is a test gap. I tested the age-based filtering by hand but don't have an automated test for it. The v2 patch should probably have at least a very basic test of the age filtering.
        stevel@apache.org Steve Loughran added a comment -

        +1 for consistency across commands

        fabbri Aaron Fabbri added a comment -

        Thanks Steve Loughran. I was thinking something similar. I'm using the same args for -list and -abort so you can specify age with both and see what you are going to delete with -list before you do -abort.

        On the <age string> part, there is already a way of specifying age for the s3guard prune command. It has four options: -days, -hours, -minutes, and -seconds. I was going to use those for consistency. Sound OK?

        I have multipart upload listing iterators (factored out of S3AFileSystem as much as possible) with integration tests, and am finishing up the CLI tool and integration tests. Should have a v1 patch to post soon.
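
To make the age options concrete, the sketch below combines -days, -hours, -minutes, and -seconds values into one duration and keeps only uploads initiated at least that long ago. It assumes the four values are summed and that "older than" is inclusive at the cutoff; both are assumptions for illustration, and every name here is hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of age-based filtering for -list and -abort. */
final class AgeFilter {
  private AgeFilter() {
  }

  /** Sums the four option values into a single minimum age (assumed semantics). */
  static Duration age(long days, long hours, long minutes, long seconds) {
    return Duration.ofDays(days)
        .plusHours(hours)
        .plusMinutes(minutes)
        .plusSeconds(seconds);
  }

  /** Returns the initiation times that are at least minAge before now. */
  static List<Instant> olderThan(List<Instant> initiated, Duration minAge, Instant now) {
    Instant cutoff = now.minus(minAge);
    List<Instant> matches = new ArrayList<>();
    for (Instant started : initiated) {
      if (!started.isAfter(cutoff)) {
        matches.add(started);
      }
    }
    return matches;
  }
}
```

Because the same filter feeds both listing and aborting, running the list first shows exactly what an abort with the same arguments would remove. The comparison relies on the initiation timestamps reported by the store, so clock skew between client and server can shift the boundary.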

        stevel@apache.org Steve Loughran added a comment -

        Here are the commands we will initially need:

        s3guard uploads -list [-verbose] path
        

        Lists pending uploads; verbose includes the size of the upload if possible, its age, and anything else useful. Maybe also allow an error code to be returned if the count does not match some param -expect <count>, for scripts & tests.

        s3guard uploads -cancel (-age <age string>| -all) path
        

        Cancel things of a specific age, using the same age string as we use in Configuration.getTimeDuration. (Which will justify moving the helper code there out into something more broadly usable.)
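
For readers unfamiliar with the age-string format, Configuration.getTimeDuration accepts a number followed by a unit suffix (ns, us, ms, s, m, h, d). The stand-alone parser below is only a sketch of that idea so the example runs without Hadoop on the classpath; real code would call Configuration.getTimeDuration rather than reimplement it. The class name and the treatment of a bare number as seconds are assumptions.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-alone parser for age strings such as "30s", "5m" or "2d". */
final class AgeString {
  // Two-letter suffixes come first so that "ms" is not mistaken for "s".
  private static final String[] SUFFIXES = {"ns", "us", "ms", "s", "m", "h", "d"};
  private static final TimeUnit[] UNITS = {
      TimeUnit.NANOSECONDS, TimeUnit.MICROSECONDS, TimeUnit.MILLISECONDS,
      TimeUnit.SECONDS, TimeUnit.MINUTES, TimeUnit.HOURS, TimeUnit.DAYS};

  private AgeString() {
  }

  /** Converts an age string to whole seconds; a bare number is read as seconds. */
  static long toSeconds(String age) {
    String text = age.trim().toLowerCase(Locale.ROOT);
    for (int i = 0; i < SUFFIXES.length; i++) {
      if (text.endsWith(SUFFIXES[i])) {
        String number =
            text.substring(0, text.length() - SUFFIXES[i].length()).trim();
        return UNITS[i].toSeconds(Long.parseLong(number));
      }
    }
    return Long.parseLong(text);
  }
}
```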

        fabbri Aaron Fabbri added a comment -

        I'll try to throw something together this week.

        Thomas Demoor Thomas Demoor added a comment -

        For purging we use TransferManager's abortMultipartUploads, which under the hood does exactly that: it calls listMultipartUploads, then iterates over that list and aborts the uploads one by one.
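
The list-then-abort pattern described here can be sketched against a small stand-in interface. Nothing below calls the AWS SDK: `UploadStore` and `InMemoryStore` are hypothetical types invented for the example so that it runs on its own, and the prefix matching is an assumed simplification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of purging pending uploads by listing, then aborting each. */
final class PurgeSketch {
  private PurgeSketch() {
  }

  /** Stand-in for the two store operations involved; not an AWS SDK type. */
  interface UploadStore {
    List<String> listUploadIds(String keyPrefix);

    void abort(String uploadId);
  }

  /** Lists the uploads under the prefix, aborts them one by one, returns the count. */
  static int purge(UploadStore store, String keyPrefix) {
    int aborted = 0;
    for (String uploadId : store.listUploadIds(keyPrefix)) {
      store.abort(uploadId);
      aborted++;
    }
    return aborted;
  }

  /** In-memory store used only to exercise the sketch. */
  static final class InMemoryStore implements UploadStore {
    private final Map<String, String> keyByUploadId = new LinkedHashMap<>();

    void start(String uploadId, String key) {
      keyByUploadId.put(uploadId, key);
    }

    @Override
    public List<String> listUploadIds(String keyPrefix) {
      // Return a copy so callers can abort while iterating over the result.
      List<String> ids = new ArrayList<>();
      for (Map.Entry<String, String> entry : keyByUploadId.entrySet()) {
        if (entry.getValue().startsWith(keyPrefix)) {
          ids.add(entry.getKey());
        }
      }
      return ids;
    }

    @Override
    public void abort(String uploadId) {
      keyByUploadId.remove(uploadId);
    }
  }
}
```

A real implementation would also need to page through list results and cope with transient failures and throttling on each call, neither of which this sketch attempts.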

        stevel@apache.org Steve Loughran added a comment -

        We can use List Multiparts to list active MPUs, which is something for tests to pick up as well (i.e. assert that there aren't any active).


          People

          • Assignee:
            fabbri Aaron Fabbri
            Reporter:
            stevel@apache.org Steve Loughran
          • Votes:
            0
            Watchers:
            4
