  1. Hadoop Common
  2. HADOOP-13626

Remove distcp dependency on FileStatus serialization

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha2
    • Component/s: tools/distcp
    • Labels:
      None

      Description

      DistCp uses an internal struct CopyListingFileStatus to record metadata. Because this record extends FileStatus, it also relies on the Writable contract from that type. Because DistCp performs its checks on a subset of the fields (i.e., does not actually rely on FileStatus as a supertype), these types should be independent.
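
      As an illustration of the decoupling described above, the following minimal Java sketch shows a record that serializes its own subset of fields instead of inheriting a serialization contract from a supertype. The class name ListingRecord, its field set, and the roundTrip helper are hypothetical and are not taken from the patch; plain java.io streams stand in for Hadoop's Writable machinery.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical record: it extends no file-status supertype and writes only
// the fields it needs, so its wire format is entirely its own.
final class ListingRecord {
  private String path;
  private long length;
  private boolean directory;
  private short replication;
  private long blockSize;

  // Empty instance, to be populated by readFields.
  ListingRecord() { }

  ListingRecord(String path, long length, boolean directory,
      short replication, long blockSize) {
    this.path = path;
    this.length = length;
    this.directory = directory;
    this.replication = replication;
    this.blockSize = blockSize;
  }

  String getPath() { return path; }
  long getLength() { return length; }
  boolean isDirectory() { return directory; }
  short getReplication() { return replication; }
  long getBlockSize() { return blockSize; }

  // Serialize the fields in a fixed order.
  void write(DataOutput out) throws IOException {
    out.writeUTF(path);
    out.writeLong(length);
    out.writeBoolean(directory);
    out.writeShort(replication);
    out.writeLong(blockSize);
  }

  // Read the fields back in the same order they were written.
  void readFields(DataInput in) throws IOException {
    path = in.readUTF();
    length = in.readLong();
    directory = in.readBoolean();
    replication = in.readShort();
    blockSize = in.readLong();
  }

  // Serialize to a byte array and deserialize into a fresh instance.
  static ListingRecord roundTrip(ListingRecord rec) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      rec.write(new DataOutputStream(bytes));
      ListingRecord copy = new ListingRecord();
      copy.readFields(new DataInputStream(
          new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    ListingRecord copy = roundTrip(
        new ListingRecord("/src/a.txt", 42L, false, (short) 3, 1024L));
    System.out.println(copy.getPath() + " " + copy.getLength());
  }
}
```

      Because the record owns its write and readFields methods, a change to how the supertype serializes itself would not alter the bytes this record produces.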

      1. HADOOP-13626.001.patch
        23 kB
        Chris Douglas
      2. HADOOP-13626.002.patch
        23 kB
        Chris Douglas
      3. HADOOP-13626.003.patch
        23 kB
        Chris Douglas
      4. HADOOP-13626.004.patch
        23 kB
        Chris Douglas

          Activity

          chris.douglas Chris Douglas added a comment -

          Missed some checkstyle complaints.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 11s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 6m 59s trunk passed
          +1 compile 0m 17s trunk passed
          +1 checkstyle 0m 13s trunk passed
          +1 mvnsite 0m 20s trunk passed
          +1 mvneclipse 0m 17s trunk passed
          +1 findbugs 0m 24s trunk passed
          +1 javadoc 0m 12s trunk passed
          +1 mvninstall 0m 16s the patch passed
          +1 compile 0m 14s the patch passed
          +1 javac 0m 14s the patch passed
          +1 checkstyle 0m 11s hadoop-tools/hadoop-distcp: The patch generated 0 new + 36 unchanged - 3 fixed = 36 total (was 39)
          +1 mvnsite 0m 18s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
          +1 findbugs 0m 29s the patch passed
          +1 javadoc 0m 10s hadoop-tools_hadoop-distcp generated 0 new + 49 unchanged - 1 fixed = 49 total (was 50)
          +1 unit 9m 32s hadoop-distcp in the patch passed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          21m 42s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Issue HADOOP-13626
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12829267/HADOOP-13626.002.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 4e0c5c0e087e 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 7558dbb
          Default Java 1.8.0_101
          findbugs v3.0.0
          whitespace https://builds.apache.org/job/PreCommit-HADOOP-Build/10544/artifact/patchprocess/whitespace-eol.txt
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/10544/testReport/
          modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/10544/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 14s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 6m 53s trunk passed
          +1 compile 0m 17s trunk passed
          +1 checkstyle 0m 14s trunk passed
          +1 mvnsite 0m 20s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 24s trunk passed
          +1 javadoc 0m 14s trunk passed
          +1 mvninstall 0m 16s the patch passed
          +1 compile 0m 15s the patch passed
          +1 javac 0m 15s the patch passed
          +1 checkstyle 0m 10s hadoop-tools/hadoop-distcp: The patch generated 0 new + 36 unchanged - 3 fixed = 36 total (was 39)
          +1 mvnsite 0m 17s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 29s the patch passed
          +1 javadoc 0m 10s hadoop-tools_hadoop-distcp generated 0 new + 49 unchanged - 1 fixed = 49 total (was 50)
          +1 unit 9m 15s hadoop-distcp in the patch passed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          21m 20s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Issue HADOOP-13626
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12829275/HADOOP-13626.003.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 74dce1e90c5d 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 7558dbb
          Default Java 1.8.0_101
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/10545/testReport/
          modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/10545/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          chris.douglas Chris Douglas added a comment -

          Chris Nauroth could you take a look? Would like to commit this before HDFS-6984; you'd cited this as a todo over there.

          liuml07 Mingliang Liu added a comment -

          The patch looks good to me overall. +1

          I have a few trivial comments:

          1. Can you also update the javadoc of CopyListingFileStatus class?
            /**
             * CopyListingFileStatus is a specialized subclass of {@link FileStatus} for
             * attaching additional data members useful to distcp.  This class does not
             * override {@link FileStatus#compareTo}, because the additional data members
             * are not relevant to sort order.
             */
            

            This will no longer be true.

          2. Can we avoid the wildcard import if not all of the members are used? Wildcard imports have caused me pain when backporting code to our internal branches.
            TestCopyListingFileStatus.java
            import static org.junit.Assert.*;
            
          3. assertEquals has the signature assertEquals(expected, actual). The failure message will be confusing if the argument order is not taken care of.
                assertEquals(clfs.getLen(), stat.getLen());
                assertEquals(clfs.isDirectory(), stat.isDirectory());
                assertEquals(clfs.getReplication(), stat.getReplication());
                assertEquals(clfs.getBlockSize(), stat.getBlockSize());
                assertEquals(clfs.getAccessTime(), stat.getAccessTime());
            ...
            

            Here, I think stat's values are the expected ones?
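
            To make the point about argument order concrete, here is a small self-contained sketch. The assertEquals helper below is a hypothetical stand-in that only mimics the JUnit 4 failure message format; it is not JUnit itself, and the values 1024 and 512 are made up for illustration.

```java
final class AssertOrderDemo {
  // Hypothetical stand-in that mimics the JUnit 4 message format.
  static void assertEquals(long expected, long actual) {
    if (expected != actual) {
      throw new AssertionError(
          "expected:<" + expected + "> but was:<" + actual + ">");
    }
  }

  // Run a check and return its failure message, or null if it passed.
  static String failureMessage(Runnable check) {
    try {
      check.run();
      return null;
    } catch (AssertionError e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    long statLen = 1024L; // reference value, i.e. the expected side
    long clfsLen = 512L;  // value under test, deliberately wrong here

    // Swapped arguments: the message blames the reference value.
    System.out.println(failureMessage(() -> assertEquals(clfsLen, statLen)));
    // prints "expected:<512> but was:<1024>"

    // Correct order: the message names the value under test as the culprit.
    System.out.println(failureMessage(() -> assertEquals(statLen, clfsLen)));
    // prints "expected:<1024> but was:<512>"
  }
}
```

            With the arguments swapped, the check still fails when it should, but the message points the reader at the wrong value.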

          chris.douglas Chris Douglas added a comment -

          Thanks, Mingliang Liu. Integrated all your review comments in v004, will commit if Jenkins comes back clean.

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 16s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 7m 23s trunk passed
          +1 compile 0m 17s trunk passed
          +1 checkstyle 0m 14s trunk passed
          +1 mvnsite 0m 23s trunk passed
          +1 mvneclipse 0m 15s trunk passed
          +1 findbugs 0m 30s trunk passed
          +1 javadoc 0m 14s trunk passed
          +1 mvninstall 0m 19s the patch passed
          +1 compile 0m 17s the patch passed
          +1 javac 0m 17s the patch passed
          +1 checkstyle 0m 12s hadoop-tools/hadoop-distcp: The patch generated 0 new + 36 unchanged - 3 fixed = 36 total (was 39)
          +1 mvnsite 0m 21s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 35s the patch passed
          +1 javadoc 0m 13s hadoop-tools_hadoop-distcp generated 0 new + 49 unchanged - 1 fixed = 49 total (was 50)
          +1 unit 11m 42s hadoop-distcp in the patch passed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          24m 58s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Issue HADOOP-13626
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12834997/HADOOP-13626.004.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 8277f3b98167 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / b18f35f
          Default Java 1.8.0_101
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/10878/testReport/
          modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/10878/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          chris.douglas Chris Douglas added a comment -

          I committed this.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10666 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10666/)
          HADOOP-13626. Remove distcp dependency on FileStatus serialization (cdouglas: rev a1a0281e12ea96476e75b076f76d5b5eb5254eea)

          • (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/CopyListingFileStatus.java
          • (add) hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestCopyListingFileStatus.java
          • (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyMapper.java
          • (edit) hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/TestRetriableFileCopyCommand.java
          • (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java
          yzhangal Yongjun Zhang added a comment -

          Hi Chris Douglas,

          Thanks for your work here. I found that backporting this to branch-2 would make the backport of HADOOP-11794 easier. Is there any concern about backporting this to branch-2? I can do it if there is not.

          Thanks.

          chris.douglas Chris Douglas added a comment -

          It should be harmless to backport to branch-2.

          yzhangal Yongjun Zhang added a comment -

          Thanks Chris Douglas, committed to branch-2.


            People

            • Assignee: Chris Douglas (chris.douglas)
            • Reporter: Chris Douglas (chris.douglas)
            • Votes: 0
            • Watchers: 8
