Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-10397

Distcp should ignore -delete option if -diff option is provided instead of exiting

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: distcp
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      In distcp, -delete and -diff options are mutually exclusive. HDFS-8828 brought strictly checking which makes the existing applications (or scripts) that work just fine with both -delete and -diff options previously stop performing because of the java.lang.IllegalArgumentException: Diff is valid only with update options exception.

      To make it backward incompatible, we can ignore the -delete option, given -diff option, instead of exiting the program. Along with that, we can print a warning message saying that Diff is valid only with update options, and -delete option is ignored.

      1. HDFS-10397.003.patch
        9 kB
        Mingliang Liu
      2. HDFS-10397.003.patch
        9 kB
        Mingliang Liu
      3. HDFS-10397.002.patch
        8 kB
        Mingliang Liu
      4. HDFS-10397.001.patch
        6 kB
        Mingliang Liu
      5. HDFS-10397.000.patch
        3 kB
        Mingliang Liu

        Issue Links

          Activity

          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #9809 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9809/)
          HDFS-10397. Distcp should ignore -delete option if -diff option is (jing9: rev 03788d3015c962eac1a35fa5df39356e8b84731c)

          • hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptions.java
          • hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/OptionsParser.java
          • hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestOptionsParser.java
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-trunk-Commit #9809 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9809/ ) HDFS-10397 . Distcp should ignore -delete option if -diff option is (jing9: rev 03788d3015c962eac1a35fa5df39356e8b84731c) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptions.java hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/OptionsParser.java hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestOptionsParser.java
          Hide
          liuml07 Mingliang Liu added a comment -

          Thanks Jing Zhao and Yongjun Zhang for the review and discussion!

          Show
          liuml07 Mingliang Liu added a comment - Thanks Jing Zhao and Yongjun Zhang for the review and discussion!
          Hide
          jingzhao Jing Zhao added a comment -

          I've committed this to trunk, branch-2 and branch-2.8. Thanks Mingliang Liu for the contribution. Thanks Yongjun Zhang for the review.

          Show
          jingzhao Jing Zhao added a comment - I've committed this to trunk, branch-2 and branch-2.8. Thanks Mingliang Liu for the contribution. Thanks Yongjun Zhang for the review.
          Hide
          yzhangal Yongjun Zhang added a comment -

          Sorry for the delayed response Mingliang Liu, +1 on rev 003.

          Show
          yzhangal Yongjun Zhang added a comment - Sorry for the delayed response Mingliang Liu , +1 on rev 003.
          Hide
          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 10s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 22s trunk passed
          +1 compile 0m 15s trunk passed
          +1 checkstyle 0m 14s trunk passed
          +1 mvnsite 0m 19s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 24s trunk passed
          +1 javadoc 0m 12s trunk passed
          +1 mvninstall 0m 15s the patch passed
          +1 compile 0m 12s the patch passed
          +1 javac 0m 12s the patch passed
          +1 checkstyle 0m 10s hadoop-tools/hadoop-distcp: patch generated 0 new + 76 unchanged - 11 fixed = 76 total (was 87)
          +1 mvnsite 0m 16s the patch passed
          +1 mvneclipse 0m 8s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 0m 31s the patch passed
          +1 javadoc 0m 12s the patch passed
          +1 unit 9m 5s hadoop-distcp in the patch passed.
          +1 asflicense 0m 15s Patch does not generate ASF License warnings.
          19m 51s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:2c91fd8
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12804477/HDFS-10397.003.patch
          JIRA Issue HDFS-10397
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 2bed8f9571b2 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 34fddd1
          Default Java 1.8.0_91
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15462/testReport/
          modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15462/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 reexec 0m 10s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 6m 22s trunk passed +1 compile 0m 15s trunk passed +1 checkstyle 0m 14s trunk passed +1 mvnsite 0m 19s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 0m 24s trunk passed +1 javadoc 0m 12s trunk passed +1 mvninstall 0m 15s the patch passed +1 compile 0m 12s the patch passed +1 javac 0m 12s the patch passed +1 checkstyle 0m 10s hadoop-tools/hadoop-distcp: patch generated 0 new + 76 unchanged - 11 fixed = 76 total (was 87) +1 mvnsite 0m 16s the patch passed +1 mvneclipse 0m 8s the patch passed +1 whitespace 0m 0s Patch has no whitespace issues. +1 findbugs 0m 31s the patch passed +1 javadoc 0m 12s the patch passed +1 unit 9m 5s hadoop-distcp in the patch passed. +1 asflicense 0m 15s Patch does not generate ASF License warnings. 19m 51s Subsystem Report/Notes Docker Image:yetus/hadoop:2c91fd8 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12804477/HDFS-10397.003.patch JIRA Issue HDFS-10397 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 2bed8f9571b2 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 34fddd1 Default Java 1.8.0_91 findbugs v3.0.0 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15462/testReport/ modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15462/console Powered by Apache Yetus 0.2.0 http://yetus.apache.org This message was automatically generated.
          Hide
          liuml07 Mingliang Liu added a comment -

          Uploading the same patch for pre-commit Jenkins run.

          Show
          liuml07 Mingliang Liu added a comment - Uploading the same patch for pre-commit Jenkins run.
          Hide
          liuml07 Mingliang Liu added a comment -

          Thanks Jing Zhao for the review. The latest build and test failures are caused by some test env misconfiguration when switching to Java 8 (see HADOOP-11858). Let's trigger Jenkins after that is fixed.

          Show
          liuml07 Mingliang Liu added a comment - Thanks Jing Zhao for the review. The latest build and test failures are caused by some test env misconfiguration when switching to Java 8 (see HADOOP-11858 ). Let's trigger Jenkins after that is fixed.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 14s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          -1 mvninstall 0m 8s root in trunk failed.
          +1 compile 0m 13s trunk passed with JDK v1.8.0_91
          -1 compile 0m 9s hadoop-distcp in trunk failed with JDK v1.7.0_95.
          +1 checkstyle 0m 12s trunk passed
          -1 mvnsite 0m 9s hadoop-distcp in trunk failed.
          +1 mvneclipse 0m 13s trunk passed
          -1 findbugs 0m 9s hadoop-distcp in trunk failed.
          +1 javadoc 0m 10s trunk passed with JDK v1.8.0_91
          +1 javadoc 0m 12s trunk passed with JDK v1.7.0_95
          -1 mvninstall 0m 9s hadoop-distcp in the patch failed.
          +1 compile 0m 13s the patch passed with JDK v1.8.0_91
          +1 javac 0m 13s the patch passed
          -1 compile 0m 8s hadoop-distcp in the patch failed with JDK v1.7.0_95.
          -1 javac 0m 8s hadoop-distcp in the patch failed with JDK v1.7.0_95.
          +1 checkstyle 0m 13s hadoop-tools/hadoop-distcp: patch generated 0 new + 76 unchanged - 11 fixed = 76 total (was 87)
          -1 mvnsite 0m 10s hadoop-distcp in the patch failed.
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          -1 findbugs 0m 8s hadoop-distcp in the patch failed.
          +1 javadoc 0m 10s the patch passed with JDK v1.8.0_91
          +1 javadoc 0m 12s the patch passed with JDK v1.7.0_95
          +1 unit 9m 4s hadoop-distcp in the patch passed with JDK v1.8.0_91.
          -1 unit 0m 9s hadoop-distcp in the patch failed with JDK v1.7.0_95.
          +1 asflicense 0m 18s Patch does not generate ASF License warnings.
          13m 39s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:cf2ee45
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12804277/HDFS-10397.003.patch
          JIRA Issue HDFS-10397
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux a30829ef71b4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 4b55642
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          mvninstall https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/branch-mvninstall-root.txt
          compile https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/branch-compile-hadoop-tools_hadoop-distcp-jdk1.7.0_95.txt
          mvnsite https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/branch-mvnsite-hadoop-tools_hadoop-distcp.txt
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/branch-findbugs-hadoop-tools_hadoop-distcp.txt
          mvninstall https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/patch-mvninstall-hadoop-tools_hadoop-distcp.txt
          compile https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/patch-compile-hadoop-tools_hadoop-distcp-jdk1.7.0_95.txt
          javac https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/patch-compile-hadoop-tools_hadoop-distcp-jdk1.7.0_95.txt
          mvnsite https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/patch-mvnsite-hadoop-tools_hadoop-distcp.txt
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/patch-findbugs-hadoop-tools_hadoop-distcp.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-distcp-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15452/testReport/
          modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15452/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 14s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. -1 mvninstall 0m 8s root in trunk failed. +1 compile 0m 13s trunk passed with JDK v1.8.0_91 -1 compile 0m 9s hadoop-distcp in trunk failed with JDK v1.7.0_95. +1 checkstyle 0m 12s trunk passed -1 mvnsite 0m 9s hadoop-distcp in trunk failed. +1 mvneclipse 0m 13s trunk passed -1 findbugs 0m 9s hadoop-distcp in trunk failed. +1 javadoc 0m 10s trunk passed with JDK v1.8.0_91 +1 javadoc 0m 12s trunk passed with JDK v1.7.0_95 -1 mvninstall 0m 9s hadoop-distcp in the patch failed. +1 compile 0m 13s the patch passed with JDK v1.8.0_91 +1 javac 0m 13s the patch passed -1 compile 0m 8s hadoop-distcp in the patch failed with JDK v1.7.0_95. -1 javac 0m 8s hadoop-distcp in the patch failed with JDK v1.7.0_95. +1 checkstyle 0m 13s hadoop-tools/hadoop-distcp: patch generated 0 new + 76 unchanged - 11 fixed = 76 total (was 87) -1 mvnsite 0m 10s hadoop-distcp in the patch failed. +1 mvneclipse 0m 11s the patch passed +1 whitespace 0m 0s Patch has no whitespace issues. -1 findbugs 0m 8s hadoop-distcp in the patch failed. +1 javadoc 0m 10s the patch passed with JDK v1.8.0_91 +1 javadoc 0m 12s the patch passed with JDK v1.7.0_95 +1 unit 9m 4s hadoop-distcp in the patch passed with JDK v1.8.0_91. -1 unit 0m 9s hadoop-distcp in the patch failed with JDK v1.7.0_95. +1 asflicense 0m 18s Patch does not generate ASF License warnings. 13m 39s Subsystem Report/Notes Docker Image:yetus/hadoop:cf2ee45 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12804277/HDFS-10397.003.patch JIRA Issue HDFS-10397 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux a30829ef71b4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 4b55642 Default Java 1.7.0_95 Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 mvninstall https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/branch-mvninstall-root.txt compile https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/branch-compile-hadoop-tools_hadoop-distcp-jdk1.7.0_95.txt mvnsite https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/branch-mvnsite-hadoop-tools_hadoop-distcp.txt findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/branch-findbugs-hadoop-tools_hadoop-distcp.txt mvninstall https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/patch-mvninstall-hadoop-tools_hadoop-distcp.txt compile https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/patch-compile-hadoop-tools_hadoop-distcp-jdk1.7.0_95.txt javac https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/patch-compile-hadoop-tools_hadoop-distcp-jdk1.7.0_95.txt mvnsite https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/patch-mvnsite-hadoop-tools_hadoop-distcp.txt findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/patch-findbugs-hadoop-tools_hadoop-distcp.txt unit https://builds.apache.org/job/PreCommit-HDFS-Build/15452/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-distcp-jdk1.7.0_95.txt JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15452/testReport/ modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15452/console Powered by Apache Yetus 0.2.0 http://yetus.apache.org This message was automatically generated.
          Hide
          jingzhao Jing Zhao added a comment -

          +1 on the 003 patch. I will hold the commit to see if Yongjun Zhang has further comments.

          Show
          jingzhao Jing Zhao added a comment - +1 on the 003 patch. I will hold the commit to see if Yongjun Zhang has further comments.
          Hide
          liuml07 Mingliang Liu added a comment -

          Thanks Jing Zhao for the review.

          It makes sense to move the ignoring -deleteMissing logic ahead of its validation. If we ignore it, i.e. by resetting its value, any exception that is caused by its old value should be bypassed.

          As to the test, it's kinda not ideal. Basically the -diff will only work with -update option, and -delete needs either -update or -overwrite option. So if a valid -diff option makes -delete ignored, there should be an -update option along with it. In this case, the -delete option will not cause exception as -update is there. As the best effort I can tell, we can test the case where only -delete and -update options are provided. The -delete option will be ignored and it cannot cause any validation exception. But the validation still fails as the -diff option is not happy w/o an -update option.

          See the v3 patch HDFS-10397.003.patch.

          Show
          liuml07 Mingliang Liu added a comment - Thanks Jing Zhao for the review. It makes sense to move the ignoring -deleteMissing logic ahead of its validation. If we ignore it, i.e. by resetting its value, any exception that is caused by its old value should be bypassed. As to the test, it's kinda not ideal. Basically the -diff will only work with -update option, and -delete needs either -update or -overwrite option. So if a valid -diff option makes -delete ignored, there should be an -update option along with it. In this case, the -delete option will not cause exception as -update is there. As the best effort I can tell, we can test the case where only -delete and -update options are provided. The -delete option will be ignored and it cannot cause any validation exception. But the validation still fails as the -diff option is not happy w/o an -update option. See the v3 patch HDFS-10397.003.patch .
          Hide
          jingzhao Jing Zhao added a comment -

          Thanks for updating the patch, Mingliang. The current patch looks good to me in general. One minor:

          580	    if (useDiff && deleteMissing) {
          581	      // -delete and -diff are mutually exclusive. For backward compatibility,
          582	      // we ignore the -delete option here, instead of throwing an
          583	      // IllegalArgumentException. See HDFS-10397 for more discussion.
          584	      OptionsParser.LOG.warn("-delete and -diff are mutually exclusive. " +
          585	          "The -delete option will be ignored.");
          586	      setDeleteMissing(false);
          587	    }
          

          This logic should be moved to the beginning of the valid method. I.e., the "-deleteMissing" should be bypassed before it causes any exception along with some other option settings. A new unit test will also be helpful for this scenario.

          +1 after addressing the comment.

          Show
          jingzhao Jing Zhao added a comment - Thanks for updating the patch, Mingliang. The current patch looks good to me in general. One minor: 580 if (useDiff && deleteMissing) { 581 // -delete and -diff are mutually exclusive. For backward compatibility, 582 // we ignore the -delete option here, instead of throwing an 583 // IllegalArgumentException. See HDFS-10397 for more discussion. 584 OptionsParser.LOG.warn( "-delete and -diff are mutually exclusive. " + 585 "The -delete option will be ignored." ); 586 setDeleteMissing( false ); 587 } This logic should be moved to the beginning of the valid method. I.e., the "-deleteMissing" should be bypassed before it causes any exception along with some other option settings. A new unit test will also be helpful for this scenario. +1 after addressing the comment.
          Hide
          yzhangal Yongjun Zhang added a comment -

          Hi Mingliang Liu,

          I think it's a very good idea to do what you did in 002. I will try to review early next week.

          Thanks a lot.

          Show
          yzhangal Yongjun Zhang added a comment - Hi Mingliang Liu , I think it's a very good idea to do what you did in 002. I will try to review early next week. Thanks a lot.
          Hide
          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 24s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 7m 28s trunk passed
          +1 compile 0m 19s trunk passed with JDK v1.8.0_91
          +1 compile 0m 18s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 15s trunk passed
          +1 mvnsite 0m 23s trunk passed
          +1 mvneclipse 0m 16s trunk passed
          +1 findbugs 0m 31s trunk passed
          +1 javadoc 0m 14s trunk passed with JDK v1.8.0_91
          +1 javadoc 0m 15s trunk passed with JDK v1.7.0_95
          +1 mvninstall 0m 19s the patch passed
          +1 compile 0m 17s the patch passed with JDK v1.8.0_91
          +1 javac 0m 17s the patch passed
          +1 compile 0m 16s the patch passed with JDK v1.7.0_95
          +1 javac 0m 16s the patch passed
          +1 checkstyle 0m 13s hadoop-tools/hadoop-distcp: patch generated 0 new + 76 unchanged - 11 fixed = 76 total (was 87)
          +1 mvnsite 0m 21s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 0m 40s the patch passed
          +1 javadoc 0m 13s the patch passed with JDK v1.8.0_91
          +1 javadoc 0m 12s the patch passed with JDK v1.7.0_95
          +1 unit 9m 2s hadoop-distcp in the patch passed with JDK v1.8.0_91.
          +1 unit 8m 44s hadoop-distcp in the patch passed with JDK v1.7.0_95.
          +1 asflicense 0m 19s Patch does not generate ASF License warnings.
          32m 8s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:cf2ee45
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12804014/HDFS-10397.002.patch
          JIRA Issue HDFS-10397
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 1598bc190d8d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 3fa1380
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15435/testReport/
          modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15435/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 reexec 0m 24s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 7m 28s trunk passed +1 compile 0m 19s trunk passed with JDK v1.8.0_91 +1 compile 0m 18s trunk passed with JDK v1.7.0_95 +1 checkstyle 0m 15s trunk passed +1 mvnsite 0m 23s trunk passed +1 mvneclipse 0m 16s trunk passed +1 findbugs 0m 31s trunk passed +1 javadoc 0m 14s trunk passed with JDK v1.8.0_91 +1 javadoc 0m 15s trunk passed with JDK v1.7.0_95 +1 mvninstall 0m 19s the patch passed +1 compile 0m 17s the patch passed with JDK v1.8.0_91 +1 javac 0m 17s the patch passed +1 compile 0m 16s the patch passed with JDK v1.7.0_95 +1 javac 0m 16s the patch passed +1 checkstyle 0m 13s hadoop-tools/hadoop-distcp: patch generated 0 new + 76 unchanged - 11 fixed = 76 total (was 87) +1 mvnsite 0m 21s the patch passed +1 mvneclipse 0m 11s the patch passed +1 whitespace 0m 0s Patch has no whitespace issues. +1 findbugs 0m 40s the patch passed +1 javadoc 0m 13s the patch passed with JDK v1.8.0_91 +1 javadoc 0m 12s the patch passed with JDK v1.7.0_95 +1 unit 9m 2s hadoop-distcp in the patch passed with JDK v1.8.0_91. +1 unit 8m 44s hadoop-distcp in the patch passed with JDK v1.7.0_95. +1 asflicense 0m 19s Patch does not generate ASF License warnings. 32m 8s Subsystem Report/Notes Docker Image:yetus/hadoop:cf2ee45 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12804014/HDFS-10397.002.patch JIRA Issue HDFS-10397 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 1598bc190d8d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 3fa1380 Default Java 1.7.0_95 Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 findbugs v3.0.0 JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15435/testReport/ modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15435/console Powered by Apache Yetus 0.2.0 http://yetus.apache.org This message was automatically generated.
          Hide
          liuml07 Mingliang Liu added a comment -

          Thanks Yongjun Zhang for the insightful suggestion. This motivated my v2 patch.

          The pain in the code to handle different option combinations comes from the fact that, for each option we may validate and set it individually. This is not a clear way as 1) not efficient, 2) not well defined, and 3) error prone.

          1. For point 1) we validate the options multiple times which is not needed or scalable.
          2. For 2) some of the options are set after validation while the other options are set without validation. Distributing the decision to validate or not to validate across all the setters smells bad to me.
          3. For 3), when we validate an option, chances are that its dependent option B is not set yet. This implies that the order of setting options have to be carefully chosen, leading to fragile code snippet. Take syncFolder and skipCRC for example, skip CRC is valid only with update options, and if we set (and thus validate) skipCRC before setting syncFolder option, the validation will fail, even if both of them are provided in the command line.

          I think a better way is to validate all the options only once after all the options are set, i.e. a central validation method. Moreover, the parser is to parse the options and should not handle the validation of option combinations explicitly, if it's possible to delegate the work to validate() method of DistCpOptions. Of course, if there is any parsing errors of a single option (eg. only one snapshot is provided for the -diff option), the parser should throw the IllegalArgumentException directly.

          What's your thought?

          Ping Jing Zhao for more input.

          Show
          liuml07 Mingliang Liu added a comment - Thanks Yongjun Zhang for the insightful suggestion. This motivated my v2 patch. The pain in the code to handle different option combinations comes from the fact that, for each option we may validate and set it individually. This is not a clear way as 1) not efficient, 2) not well defined, and 3) error prone. For point 1) we validate the options multiple times which is not needed or scalable. For 2) some of the options are set after validation while the other options are set without validation. Distributing the decision to validate or not to validate across all the setters smells bad to me. For 3), when we validate an option, chances are that its dependent option B is not set yet. This implies that the order of setting options have to be carefully chosen, leading to fragile code snippet. Take syncFolder and skipCRC for example, skip CRC is valid only with update options, and if we set (and thus validate) skipCRC before setting syncFolder option, the validation will fail, even if both of them are provided in the command line. I think a better way is to validate all the options only once after all the options are set, i.e. a central validation method. Moreover, the parser is to parse the options and should not handle the validation of option combinations explicitly, if it's possible to delegate the work to validate() method of DistCpOptions . Of course, if there is any parsing errors of a single option (eg. only one snapshot is provided for the -diff option), the parser should throw the IllegalArgumentException directly. What's your thought? Ping Jing Zhao for more input.
          Hide
          yzhangal Yongjun Zhang added a comment -

          Hi Mingliang Liu,

          Thanks for the patch. I did a quick browse, and have the following suggestion. Instead of changing validate method, I think we can possibly change the OptionParser#parse method, something like

          boolean deleteMissing = command.hasOption(DistCpOptionSwitch.DELETE_MISSING.getSwitch());
          boolean diff = command.hasOption(DistCpOptionSwitch.DIFF.getSwitch());
          boolean ignoreDeleteMissing = deleteMissing && diff;
          if (ignoreDeleteMissing) {
           // issue warning message
          } else if (deleteMissing) {
           //set deleteMissing
          }
          
          if (diff) {
            // set diff
          }
          

          Basically let the parser to decide whether to ignore some switches, and let DistCpOption#validate to decide other invalid situations and throw InvalidArguement exception. This seems cleaner to me.

          What do you think?

          Thanks.

          Show
          yzhangal Yongjun Zhang added a comment - Hi Mingliang Liu , Thanks for the patch. I did a quick browse, and have the following suggestion. Instead of changing validate method, I think we can possibly change the OptionParser#parse method, something like boolean deleteMissing = command.hasOption(DistCpOptionSwitch.DELETE_MISSING.getSwitch()); boolean diff = command.hasOption(DistCpOptionSwitch.DIFF.getSwitch()); boolean ignoreDeleteMissing = deleteMissing && diff; if (ignoreDeleteMissing) { // issue warning message } else if (deleteMissing) { //set deleteMissing } if (diff) { // set diff } Basically let the parser to decide whether to ignore some switches, and let DistCpOption#validate to decide other invalid situations and throw InvalidArguement exception. This seems cleaner to me. What do you think? Thanks.
          Hide
          liuml07 Mingliang Liu added a comment -

          As we setDeleteMissing before setUseDiff, the DistCpOptions#validate will meet the conflicting options only when validating -diff option via setUseDiff. As v0 patch did, a simple fix is to reset the -delete option in the DistCpOptions#validate.

          Per offline discussion with Jing Zhao, this solution is error prone. There is no strictly defined parsing order of the options. Specifically, if we setUseDiff before setDeleteMissing, then setDeleteMissing will pass the validation and setting a true value for -delete, which is not expected.

          V1 patch is to address this problem.

          Show
          liuml07 Mingliang Liu added a comment - As we setDeleteMissing before setUseDiff , the DistCpOptions#validate will meet the conflicting options only when validating -diff option via setUseDiff . As v0 patch did, a simple fix is to reset the -delete option in the DistCpOptions#validate . Per offline discussion with Jing Zhao , this solution is error prone. There is no strictly defined parsing order of the options. Specifically, if we setUseDiff before setDeleteMissing , then setDeleteMissing will pass the validation and setting a true value for -delete , which is not expected. V1 patch is to address this problem.
          Hide
          liuml07 Mingliang Liu added a comment -

          The v0 patch is to address this problem. Please kindly review.

          Show
          liuml07 Mingliang Liu added a comment - The v0 patch is to address this problem. Please kindly review.
          Hide
          liuml07 Mingliang Liu added a comment -

          Sample error message:

          2016-05-05 06:41:57,429|beaver.machine|INFO|23004|140021173225216|MainThread|RUNNING: /usr/hdp/current/hadoop-client/bin/hadoop distcp -update -delete -diff s1 s2  /tmp/source /tmp/target
          2016-05-05 06:41:58,461|beaver.machine|INFO|23004|140021173225216|MainThread|16/05/05 06:41:58 ERROR tools.DistCp: Invalid arguments:
          2016-05-05 06:41:58,462|beaver.machine|INFO|23004|140021173225216|MainThread|java.lang.IllegalArgumentException: Diff is valid only with update options
          2016-05-05 06:41:58,462|beaver.machine|INFO|23004|140021173225216|MainThread|at org.apache.hadoop.tools.DistCpOptions.validate(DistCpOptions.java:595)
          2016-05-05 06:41:58,462|beaver.machine|INFO|23004|140021173225216|MainThread|at org.apache.hadoop.tools.DistCpOptions.setUseDiff(DistCpOptions.java:287)
          2016-05-05 06:41:58,462|beaver.machine|INFO|23004|140021173225216|MainThread|at org.apache.hadoop.tools.OptionsParser.parse(OptionsParser.java:235)
          2016-05-05 06:41:58,462|beaver.machine|INFO|23004|140021173225216|MainThread|at org.apache.hadoop.tools.DistCp.run(DistCp.java:115)
          2016-05-05 06:41:58,462|beaver.machine|INFO|23004|140021173225216|MainThread|at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
          2016-05-05 06:41:58,463|beaver.machine|INFO|23004|140021173225216|MainThread|at org.apache.hadoop.tools.DistCp.main(DistCp.java:453)
          2016-05-05 06:41:58,463|beaver.machine|INFO|23004|140021173225216|MainThread|Invalid arguments: Diff is valid only with update options
          
          Show
          liuml07 Mingliang Liu added a comment - Sample error message: 2016-05-05 06:41:57,429|beaver.machine|INFO|23004|140021173225216|MainThread|RUNNING: /usr/hdp/current/hadoop-client/bin/hadoop distcp -update -delete -diff s1 s2 /tmp/source /tmp/target 2016-05-05 06:41:58,461|beaver.machine|INFO|23004|140021173225216|MainThread|16/05/05 06:41:58 ERROR tools.DistCp: Invalid arguments: 2016-05-05 06:41:58,462|beaver.machine|INFO|23004|140021173225216|MainThread|java.lang.IllegalArgumentException: Diff is valid only with update options 2016-05-05 06:41:58,462|beaver.machine|INFO|23004|140021173225216|MainThread|at org.apache.hadoop.tools.DistCpOptions.validate(DistCpOptions.java:595) 2016-05-05 06:41:58,462|beaver.machine|INFO|23004|140021173225216|MainThread|at org.apache.hadoop.tools.DistCpOptions.setUseDiff(DistCpOptions.java:287) 2016-05-05 06:41:58,462|beaver.machine|INFO|23004|140021173225216|MainThread|at org.apache.hadoop.tools.OptionsParser.parse(OptionsParser.java:235) 2016-05-05 06:41:58,462|beaver.machine|INFO|23004|140021173225216|MainThread|at org.apache.hadoop.tools.DistCp.run(DistCp.java:115) 2016-05-05 06:41:58,462|beaver.machine|INFO|23004|140021173225216|MainThread|at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) 2016-05-05 06:41:58,463|beaver.machine|INFO|23004|140021173225216|MainThread|at org.apache.hadoop.tools.DistCp.main(DistCp.java:453) 2016-05-05 06:41:58,463|beaver.machine|INFO|23004|140021173225216|MainThread|Invalid arguments: Diff is valid only with update options

            People

            • Assignee:
              liuml07 Mingliang Liu
              Reporter:
              liuml07 Mingliang Liu
            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development