Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0-alpha4
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Incompatible change, Reviewed
    • Release Note: If -p option of distcp command is unspecified, block size is preserved.

      Description

      We should have preserve blocksize (-pb) turned on in distcp by default.

      The checksum comparison, which is on by default, will always fail if the block size is not the same.

      1. HADOOP-8143.1.patch
        2 kB
        Mithun Radhakrishnan
      2. HADOOP-8143.2.patch
        2 kB
        Mithun Radhakrishnan
      3. HADOOP-8143.3.patch
        2 kB
        Mithun Radhakrishnan

          Activity

          atm Aaron T. Myers added a comment -

          Hey Dave, can you comment on what version of Hadoop you were running where you encountered this issue? If so, could you please update the "affects version" and "target version" fields? Thanks a lot.

          aw Allen Wittenauer added a comment -

          Checksumming should get fixed rather than forcing the block size. I suspect forcing block size will break non-HDFS methods in surprising ways.

          davet Dave Thompson added a comment -

          Sounds good, Allen. I've considered your comment and decided on an alternate approach. I've created HADOOP-8233, which targets dropping the checksum comparison when the block size differs between source and target.

          davet Dave Thompson added a comment -

          Closing out issue for an alternate approach.

          mithun Mithun Radhakrishnan added a comment -

          Chaps, would it be ok if we revisited this ask?

          1. Dave Thompson's original problem remains, i.e. copying files between 2 clusters with different default block-sizes will fail, without either -pb or -skipcrccheck. HADOOP-8233 only solves this for 0-byte files.

          2. File-formats such as ORC perform several optimizations w.r.t. data-stripes and HDFS-block-sizes. If such files were to be copied between clusters without preserving block-sizes, there would ensue performance-fails (at best) or data-corruptions (at worst).

          Would it be acceptable to preserve block-sizes by default (i.e. if -p isn't used), only if the source and target file-systems are HDFS?
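
          The HDFS-only condition proposed above could be sketched roughly as follows. This is a minimal illustration, not DistCp code: the class and method names are hypothetical, and it assumes a file system can be recognised as HDFS by the "hdfs" URI scheme alone. The thread does not say whether the committed change includes such a check.

```java
import java.net.URI;

/** Hypothetical helper; not part of DistCp. */
final class HdfsOnlyDefault {

  /** Assumption: a file system counts as HDFS when its URI scheme is "hdfs". */
  static boolean isHdfs(URI uri) {
    return "hdfs".equalsIgnoreCase(uri.getScheme());
  }

  /**
   * Block size is preserved by default only when the user gave no -p flag
   * at all and both ends of the copy are HDFS.
   */
  static boolean preserveBlockSizeByDefault(boolean preserveFlagGiven,
                                            URI source, URI target) {
    return !preserveFlagGiven && isHdfs(source) && isHdfs(target);
  }

  public static void main(String[] args) {
    URI src = URI.create("hdfs://nn1:8020/data/a");
    URI dst = URI.create("hdfs://nn2:8020/data/a");
    System.out.println(preserveBlockSizeByDefault(false, src, dst));
  }
}
```

          With this rule, a copy to or from a non-HDFS store keeps the target's default block size unless the user asks for -pb explicitly.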

          mithun Mithun Radhakrishnan added a comment -

          Tentative fix. This preserves block-size by default (but only if -p isn't specified at all). This assumes that if the user said -pug, then block-size was deliberately left out.
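
          The defaulting rule described here can be sketched as below. This is a simplified model under stated assumptions, not the patch itself: PreserveDefaults, Attr and resolve are hypothetical names, only a subset of the -p letters is modelled, and a bare -p with no letters is not handled.

```java
import java.util.EnumSet;

/** Hypothetical model of the defaulting rule; not DistCp's option parser. */
final class PreserveDefaults {

  /** A subset of the attributes that distcp's -p flag can name. */
  enum Attr {
    REPLICATION('r'), BLOCKSIZE('b'), USER('u'), GROUP('g'), PERMISSION('p');

    final char letter;

    Attr(char letter) {
      this.letter = letter;
    }

    static Attr fromLetter(char c) {
      for (Attr a : values()) {
        if (a.letter == c) {
          return a;
        }
      }
      throw new IllegalArgumentException("Unknown preserve flag: " + c);
    }
  }

  /**
   * @param flags the letters following -p (e.g. "ug"), or null if -p was
   *              not given at all
   */
  static EnumSet<Attr> resolve(String flags) {
    if (flags == null) {
      // No -p on the command line: preserve block size by default.
      return EnumSet.of(Attr.BLOCKSIZE);
    }
    // -p was given: honour exactly what the user listed, nothing more.
    EnumSet<Attr> attrs = EnumSet.noneOf(Attr.class);
    for (char c : flags.toCharArray()) {
      attrs.add(Attr.fromLetter(c));
    }
    return attrs;
  }

  public static void main(String[] args) {
    System.out.println(resolve(null));
    System.out.println(resolve("ug"));
  }
}
```

          Under this model, -pug yields only user and group, so block size is left out, matching the assumption that the omission was deliberate.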

          mithun Mithun Radhakrishnan added a comment -

          Allen Wittenauer: "forcing block size will break non-HDFS methods in surprising ways."

          Here's the code in DistCp that is affected by preserving block-size:

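            // Picks the block size to use when creating the target file.
            private static long getBlockSize(
                    EnumSet<FileAttribute> fileAttributes,
                    FileStatus sourceFile, FileSystem targetFS, Path tmpTargetPath) {
              // Preserving the checksum type also implies keeping the source block size.
              boolean preserve = fileAttributes.contains(FileAttribute.BLOCKSIZE)
                  || fileAttributes.contains(FileAttribute.CHECKSUMTYPE);
              // Otherwise fall back to the target file system's default for this path.
              return preserve ? sourceFile.getBlockSize() : targetFS
                  .getDefaultBlockSize(tmpTargetPath);
            }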
          

          Would the concern be that FileStatus.getBlockSize() might conk if the source-file isn't on HDFS? It's more likely that FileSystem.getDefaultBlockSize() is being called for a non-HDFS file-system as well, by default.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12678063/HADOOP-8143.1.patch
          against trunk revision 6f5f604.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.mapred.TestMRTimelineEventHandling
          org.apache.hadoop.yarn.client.TestGetGroups
          org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
          org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
          org.apache.hadoop.yarn.client.TestApplicationMasterServiceProtocolOnHA
          org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
          org.apache.hadoop.yarn.client.TestRMFailover
          org.apache.hadoop.yarn.client.api.impl.TestYarnClient
          org.apache.hadoop.yarn.client.cli.TestYarnCLI
          org.apache.hadoop.yarn.client.TestResourceManagerAdministrationProtocolPBClientImpl
          org.apache.hadoop.yarn.client.api.impl.TestNMClient

          The test build failed in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4985//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4985//console

          This message is automatically generated.

          aw Allen Wittenauer added a comment -

          "If such files were to be copied between clusters without preserving block-sizes, there would ensue performance-fails (at best) or data-corruptions (at worst)."

          Wow, such fragility. If data-corruption happens because the block size is changing, that sounds like a great reason not to use ORC.

          mithun Mithun Radhakrishnan added a comment -

          Heh, actually, thank you for correcting me, Allen. You're right, the "data-corruption" part is a clumsily worded overstatement, and not ORC-specific.

          1. Checksum verification between source and target is guaranteed to fail for files that have identical contents but different block-sizes (and that span multiple blocks). If HDFS has been working to fix this, do let me know of the JIRA. The only way to have DistCp succeed in copying them is to skip checksums. And this raises the potential for bad copies of the file, regardless of format.

          2. There's potential for performance degradation when ORC files with large stripes are copied to clusters with smaller block-sizes, if block-sizes aren't preserved.

          While #2 is of some concern, #1 is of maximum import.
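
          Point #1 can be illustrated with a simplified model. The sketch below assumes a file checksum that is composed as a digest over per-block digests, so block boundaries feed into the result. The names are hypothetical, and hashing raw block bytes with MD5 is a stand-in: real HDFS checksums are built from CRCs of smaller chunks, which this does not reproduce.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Simplified, hypothetical model of a block-composed file checksum. */
final class BlockChecksumSketch {

  /** Returns an MD5 over the MD5 digests of each blockSize-sized block. */
  static byte[] fileChecksum(byte[] data, int blockSize) {
    if (blockSize <= 0) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    try {
      MessageDigest outer = MessageDigest.getInstance("MD5");
      for (int off = 0; off < data.length; off += blockSize) {
        int len = Math.min(blockSize, data.length - off);
        MessageDigest inner = MessageDigest.getInstance("MD5");
        inner.update(data, off, len);
        // The per-block digest, not the raw bytes, feeds the file digest.
        outer.update(inner.digest());
      }
      return outer.digest();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 should always be available", e);
    }
  }

  public static void main(String[] args) {
    byte[] data = "identical contents on both clusters"
        .getBytes(StandardCharsets.UTF_8);
    // Same bytes, different block sizes: compare the composed checksums.
    boolean equal = Arrays.equals(fileChecksum(data, 8), fileChecksum(data, 16));
    System.out.println("checksums equal across block sizes: " + equal);
  }
}
```

          In this model, a file that fits in a single block under both block sizes gets the same checksum either way; only files spanning several blocks are affected, which is the case the comment singles out.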

          jlowe Jason Lowe added a comment -

          I agree that #1 seems pretty bad if distcp is going to skip data integrity checks by default. If that's the case then that seems like a strong argument to have distcp preserve block sizes by default so it can still do checksums by default.

          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12678063/HADOOP-8143.1.patch
          against trunk revision e17e5ba.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-tools/hadoop-distcp.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5834//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5834//console

          This message is automatically generated.

          sushanth Sushanth Sowmyan added a comment -

          Hi,

          It looks like this patch might have gotten lost in limbo. Is there a target Hadoop version by which we can expect to see this patch? As more tools like Hive and Falcon use distcp automatically behind the scenes, this gets more important.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote  Subsystem   Runtime  Comment
          0     reexec      0m 15s   Docker mode activated.
          +1    @author     0m 0s    The patch does not contain any @author tags.
          +1    test4tests  0m 0s    The patch appears to include 1 new or modified test files.
          +1    mvninstall  9m 21s   trunk passed
          +1    compile     0m 27s   trunk passed with JDK v1.8.0_74
          +1    compile     0m 25s   trunk passed with JDK v1.7.0_95
          +1    checkstyle  0m 20s   trunk passed
          +1    mvnsite     0m 31s   trunk passed
          +1    mvneclipse  0m 19s   trunk passed
          +1    findbugs    0m 38s   trunk passed
          +1    javadoc     0m 21s   trunk passed with JDK v1.8.0_74
          +1    javadoc     0m 20s   trunk passed with JDK v1.7.0_95
          +1    mvninstall  0m 24s   the patch passed
          +1    compile     0m 22s   the patch passed with JDK v1.8.0_74
          +1    javac       0m 22s   the patch passed
          +1    compile     0m 20s   the patch passed with JDK v1.7.0_95
          +1    javac       0m 20s   the patch passed
          -1    checkstyle  0m 14s   hadoop-tools/hadoop-distcp: patch generated 1 new + 13 unchanged - 0 fixed = 14 total (was 13)
          +1    mvnsite     0m 26s   the patch passed
          +1    mvneclipse  0m 14s   the patch passed
          +1    whitespace  0m 0s    Patch has no whitespace issues.
          +1    findbugs    0m 50s   the patch passed
          +1    javadoc     0m 18s   the patch passed with JDK v1.8.0_74
          +1    javadoc     0m 16s   the patch passed with JDK v1.7.0_95
          -1    unit        10m 4s   hadoop-distcp in the patch failed with JDK v1.8.0_74.
          -1    unit        9m 31s   hadoop-distcp in the patch failed with JDK v1.7.0_95.
          -1    asflicense  0m 24s   Patch generated 1 ASF License warnings.
                            37m 44s



          Reason                            Tests
          JDK v1.8.0_74 Failed junit tests  hadoop.tools.TestOptionsParser
          JDK v1.7.0_95 Failed junit tests  hadoop.tools.TestOptionsParser



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:fbe3e86
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12678063/HADOOP-8143.1.patch
          JIRA Issue HADOOP-8143
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 88cfdb4227ab 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 948b758
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HADOOP-Build/8945/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-distcp.txt
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/8945/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-distcp-jdk1.8.0_74.txt
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/8945/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-distcp-jdk1.7.0_95.txt
          unit test logs https://builds.apache.org/job/PreCommit-HADOOP-Build/8945/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-distcp-jdk1.8.0_74.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/8945/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-distcp-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/8945/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HADOOP-Build/8945/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/8945/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          mithun Mithun Radhakrishnan added a comment -

          It's that time of year again when one wonders whether this fix may be considered for submission. :]
          How about it, chaps?

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote  Subsystem   Runtime  Comment
          0     reexec      0m 0s    Docker mode activated.
          -1    patch       0m 4s    HADOOP-8143 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help.



          Subsystem Report/Notes
          JIRA Issue HADOOP-8143
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12678063/HADOOP-8143.1.patch
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/11481/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          jlowe Jason Lowe added a comment -

          My apologies, this got lost again, and the patch needs a refresh.

          I'm OK with this going into Hadoop 3.x and am leaning towards allowing it into 2.x. Allen Wittenauer, do you have any thoughts on where this should be applied?

          aw Allen Wittenauer added a comment -

          It's an incompatible and surprising change, so it can't go into branch-2.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote  Subsystem   Runtime  Comment
          0     reexec      0m 0s    Docker mode activated.
          -1    patch       0m 4s    HADOOP-8143 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help.



          Subsystem Report/Notes
          JIRA Issue HADOOP-8143
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12678063/HADOOP-8143.1.patch
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/12568/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          mithun Mithun Radhakrishnan added a comment -

          Rebased to work with changes on trunk.

          mithun Mithun Radhakrishnan added a comment -

          Re-submitting for tests.

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote  Subsystem   Runtime  Comment
          0     reexec      0m 15s   Docker mode activated.
          +1    @author     0m 0s    The patch does not contain any @author tags.
          +1    test4tests  0m 0s    The patch appears to include 1 new or modified test files.
          +1    mvninstall  13m 42s  trunk passed
          +1    compile     0m 21s   trunk passed
          +1    checkstyle  0m 15s   trunk passed
          +1    mvnsite     0m 21s   trunk passed
          +1    findbugs    0m 27s   trunk passed
          +1    javadoc     0m 14s   trunk passed
          +1    mvninstall  0m 20s   the patch passed
          +1    compile     0m 17s   the patch passed
          +1    javac       0m 17s   the patch passed
          -0    checkstyle  0m 12s   hadoop-tools/hadoop-distcp: The patch generated 1 new + 29 unchanged - 0 fixed = 30 total (was 29)
          +1    mvnsite     0m 20s   the patch passed
          +1    whitespace  0m 0s    The patch has no whitespace issues.
          +1    findbugs    0m 34s   the patch passed
          +1    javadoc     0m 12s   the patch passed
          +1    unit        13m 13s  hadoop-distcp in the patch passed.
          +1    asflicense  0m 19s   The patch does not generate ASF License warnings.
                            32m 16s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HADOOP-8143
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12873536/HADOOP-8143.2.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux b399bd038e45 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 73fb750
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HADOOP-Build/12569/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-distcp.txt
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/12569/testReport/
          modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/12569/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          jlowe Jason Lowe added a comment -

          Thanks for updating the patch! It would be nice to fix the checkstyle issue caused by the else being on a separate line. Once that's fixed I'd be happy to commit this to trunk.

          mithun Mithun Radhakrishnan added a comment -

          Sorry, just saw that. Here's the correction.

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 15s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 13m 20s trunk passed
          +1 compile 0m 18s trunk passed
          +1 checkstyle 0m 14s trunk passed
          +1 mvnsite 0m 21s trunk passed
          +1 findbugs 0m 25s trunk passed
          +1 javadoc 0m 14s trunk passed
          +1 mvninstall 0m 17s the patch passed
          +1 compile 0m 15s the patch passed
          +1 javac 0m 15s the patch passed
          +1 checkstyle 0m 11s the patch passed
          +1 mvnsite 0m 18s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 30s the patch passed
          +1 javadoc 0m 11s the patch passed
          +1 unit 13m 22s hadoop-distcp in the patch passed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          31m 39s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HADOOP-8143
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12873551/HADOOP-8143.3.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 9382f26c35f7 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 73fb750
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/12570/testReport/
          modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/12570/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          jlowe Jason Lowe added a comment -

          +1 for the latest patch. Committing this.

          jlowe Jason Lowe added a comment -

          Thanks to Mithun Radhakrishnan for the contribution and to Allen Wittenauer for additional review! I committed this to trunk.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11895 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11895/)
          HADOOP-8143. Change distcp to have -pb on by default. Contributed by (jlowe: rev dd65eea74b1f9dde858ff34df8111e5340115511)

          • (edit) hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestOptionsParser.java
          • (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/OptionsParser.java
          jojochuang Wei-Chiu Chuang added a comment -

          I am sorry, I only just noticed this.
          This incompatible change deserves a release note. Would anyone be up for writing a short description? I would also recommend updating the DistCp documentation to note this change.

          mithun Mithun Radhakrishnan added a comment -

          If -p option of distcp command is unspecified, block size is preserved.

          That looks good. What about:

          Block-size is preserved, even if the "-p" option of distcp command is unspecified.
          

          ?

          jojochuang Wei-Chiu Chuang added a comment -

          Hi Mithun,
          My understanding of the patch is that if -p is not specified, block size is preserved; however, if distcp runs with, say, -pr (preserve replication factor), then block size is not preserved.
          What you proposed sounds as though -pb becomes a deprecated option because block size is always preserved, which I don't think is the purpose of this jira. Please correct me if I am wrong. Thanks.

          mithun Mithun Radhakrishnan added a comment -

          What you proposed sounds as though -pb becomes a deprecated option because block size is always preserved.

          Ah, yes. I see. I stand corrected. :] Your phrasing is more accurate. Thank you.
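
For readers skimming the thread, the behaviour agreed on above can be sketched in a few lines of Java. This is only an illustrative model of the rule as stated in the comments (no -p option means block size is preserved; an explicit option such as -pr preserves only what it names). The class `PreserveSketch`, its `Attr` enum, and both methods are hypothetical names invented for this sketch; it is not the actual OptionsParser code from the patch, and it models only the `r` and `b` letters mentioned in the discussion.

```java
import java.util.EnumSet;

/** Hypothetical model of the preserve-flag rule discussed in this thread; not the real OptionsParser. */
class PreserveSketch {
    /** Only the two attributes mentioned in the thread are modelled. */
    enum Attr { REPLICATION, BLOCKSIZE }

    /** Returns the attributes that would be preserved for the given distcp-style arguments. */
    static EnumSet<Attr> preserved(String... args) {
        EnumSet<Attr> result = EnumSet.noneOf(Attr.class);
        boolean sawPreserveOption = false;
        for (String arg : args) {
            if (!arg.startsWith("-p")) {
                continue; // assumed to be a path or an unrelated option
            }
            if (arg.equals("-p")) {
                // A bare -p is deliberately left out of this sketch.
                throw new UnsupportedOperationException("bare -p is not modelled here");
            }
            sawPreserveOption = true;
            for (char c : arg.substring(2).toCharArray()) {
                if (c == 'r') {
                    result.add(Attr.REPLICATION);
                } else if (c == 'b') {
                    result.add(Attr.BLOCKSIZE);
                }
                // Other letters are ignored by this sketch.
            }
        }
        if (!sawPreserveOption) {
            // Default described in the release note: block size is preserved when -p is unspecified.
            result.add(Attr.BLOCKSIZE);
        }
        return result;
    }

    /** Convenience check for the one attribute this issue is about. */
    static boolean preservesBlockSize(String... args) {
        return preserved(args).contains(Attr.BLOCKSIZE);
    }

    public static void main(String[] args) {
        System.out.println(preserved("src", "dst"));        // [BLOCKSIZE]
        System.out.println(preserved("-pr", "src", "dst")); // [REPLICATION]
        System.out.println(preserved("-pb", "src", "dst")); // [BLOCKSIZE]
    }
}
```

The middle case is the one the release-note wording has to get right: once any explicit -p letters are given, block size is preserved only if `b` is among them, which is why -pb remains a meaningful option.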

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12041 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12041/)
          HADOOP-14557. Document HADOOP-8143 (Change distcp to have -pb on by (weichiu: rev 44350fdf495f5cf1bb15b1fe6f6e9587d3de0a59)

          • (edit) hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm

            People

            • Assignee:
              mithun Mithun Radhakrishnan
              Reporter:
              davet Dave Thompson
            • Votes:
              0
              Watchers:
              11
