Hadoop Common / HADOOP-9209

Add shell command to dump file checksums

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.3-alpha, 3.0.0
    • Fix Version/s: 0.23.7, 2.1.0-beta
    • Component/s: fs, tools
    • Labels: None

      Description

      Occasionally while working with tools like distcp, or debugging certain issues, it's useful to be able to quickly see the checksum of a file. We currently have the APIs to efficiently calculate a checksum, but we don't expose it to users. This JIRA is to add a "fs -checksum" command which dumps the checksum information for the specified file(s).
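
      For reference, a minimal sketch of the underlying call such a command wraps, the existing public FileSystem.getFileChecksum() API, might look like the following (the class name and output formatting are illustrative, not the patch's actual code):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileChecksum;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.util.StringUtils;

        // Hypothetical demo class, not part of the patch.
        public class ChecksumDump {
          public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            for (String arg : args) {
              // The API this JIRA exposes through "fs -checksum".
              FileChecksum sum = fs.getFileChecksum(new Path(arg));
              if (sum == null) {
                // e.g. a filesystem that keeps no checksums
                System.out.println(arg + "\tNONE");
              } else {
                System.out.println(arg + "\t" + sum.getAlgorithmName() + "\t"
                    + StringUtils.byteToHexString(sum.getBytes()));
              }
            }
          }
        }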

      1. hadoop-9209.txt
        6 kB
        Todd Lipcon
      2. hadoop-9209.txt
        6 kB
        Todd Lipcon
      3. hadoop-9209.txt
        6 kB
        Kihwal Lee

          Activity

          Kihwal Lee added a comment -

          > The failed QA comment was from a mistaken patch that was uploaded to this JIRA.

          The pre-commit ran against a patch that had something extra, but the failure was not caused by the patch. The release audit warning was about CHANGES.branch-trunk-win.txt not having the Apache license header, and the test failure is a known issue following the Windows branch merge.

          Jonathan Eagles added a comment -

          The failed QA comment was from a mistaken patch that was uploaded to this JIRA.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1365 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1365/)
          HADOOP-9209. Add shell command to dump file checksums (Todd Lipcon via jeagles) (Revision 1453613)

          Result = SUCCESS
          jeagles : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453613
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Display.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1337 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1337/)
          HADOOP-9209. Add shell command to dump file checksums (Todd Lipcon via jeagles) (Revision 1453613)

          Result = SUCCESS
          jeagles : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453613
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Display.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #546 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/546/)
          HADOOP-9209. Add shell command to dump file checksums (Todd Lipcon via jeagles) (Revision 1453625)

          Result = UNSTABLE
          jeagles : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453625
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Display.java
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
          Hudson added a comment -

          Integrated in Hadoop-Yarn-trunk #148 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/148/)
          HADOOP-9209. Add shell command to dump file checksums (Todd Lipcon via jeagles) (Revision 1453613)

          Result = SUCCESS
          jeagles : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453613
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Display.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12572423/hadoop-9209.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified test files.

          +1 tests included appear to have a timeout.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 1 release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.util.TestWinUtils

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2282//testReport/
          Release audit warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/2282//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2282//console

          This message is automatically generated.

          Hudson added a comment -

          Integrated in Hadoop-trunk-Commit #3427 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3427/)
          HADOOP-9209. Add shell command to dump file checksums (Todd Lipcon via jeagles) (Revision 1453613)

          Result = SUCCESS
          jeagles : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453613
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Display.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
          Jonathan Eagles added a comment -

          +1 I have verified the behavior and the tests in my environment. Thanks Kihwal and Todd!

          Kihwal Lee added a comment -

          I just added a couple of lines to the help message about what determines a checksum.

          Kihwal Lee added a comment -

          The help message is a minor point. +1, as long as it says the checksum is determined by the content AND other parameters.

          Kihwal Lee added a comment -

          > I hesitate to add implementation details, or refs to javadocs in the shell output. Would it suffice for the help output to say "This is an inefficient operation"?

          The intention is to let users know what to expect. The issue is not its efficiency but its behavior, which may surprise users who are used to the various checksum/digest commands. We can probably briefly mention that the checksum depends on the block size in addition to the content.

          Daryn Sharp added a comment -

          I hesitate to add implementation details, or refs to javadocs in the shell output. Would it suffice for the help output to say "This is an inefficient operation"?

          Todd Lipcon added a comment -

          Where do you think these docs should go? I hesitate to duplicate them in both the shell -help output and the getFileChecksum javadoc. Do you think it's OK to leave a pointer from the shell output to the javadoc for FileChecksum?

          Kihwal Lee added a comment -

          > Does that mesh with your understanding?

          Yes.

          The block size is a factor in determining crcPerBlock, which is part of the algorithm name. But when the file size is less than the block size, crcPerBlock will be 0 (as in the test cases in the patch). The only case that might confuse users is two such identical files with different preferred block sizes: if the files are appended to, then as soon as they grow past the block size of one of them, the two checksums and algorithm names will look different.

          Todd Lipcon added a comment -

          Yeah... the issue is that the distinct properties are odd. Here's a first crack at how I understand it:

          • If the checksum "algorithm names" are different, then we can say nothing about whether the files are identical. (does the "algorithm name" fully encompass things like the block size?)
          • If the checksum "algorithm names" are the same, and the checksums are the same, then the files are probably identical (except for possibilities of hash collision)
          • If the checksum "algorithm names" are the same, but the checksums differ, then the files are definitely not identical.

          Does that mesh with your understanding? Or does the block size not properly propagate into the algorithm name string? (And if that's the case, under what circumstances can we actually make definitive judgments?)
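
          If that reading is right, the decision procedure could be sketched roughly as follows (the class, method, and verdict names here are purely illustrative, not from the patch):

            import java.util.Arrays;
            import org.apache.hadoop.fs.FileChecksum;

            // Illustrative sketch of the three rules above; not code from the patch.
            class ChecksumCompare {
              enum Verdict { INCONCLUSIVE, PROBABLY_IDENTICAL, DEFINITELY_DIFFERENT }

              static Verdict compare(FileChecksum a, FileChecksum b) {
                // Different algorithm names (different bytes-per-CRC, crcPerBlock,
                // or checksum type): the digests are not comparable at all.
                if (!a.getAlgorithmName().equals(b.getAlgorithmName())) {
                  return Verdict.INCONCLUSIVE;
                }
                // Same algorithm and same digest: identical, barring an MD5 collision.
                if (Arrays.equals(a.getBytes(), b.getBytes())) {
                  return Verdict.PROBABLY_IDENTICAL;
                }
                // Same algorithm but different digest: contents definitely differ.
                return Verdict.DEFINITELY_DIFFERENT;
              }
            }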

          Kihwal Lee added a comment -

          > To play devil's advocate, though, we do expose FileSystem.getFileChecksum() as a public API, so it seems like offering CLI access to the same API is equivalent.

          CLI access through FsShell sounds reasonable, as long as the distinct properties of the HDFS file checksum are properly documented.

          Todd Lipcon added a comment -

          Kihwal Lee, that's a good point. Maybe it's better not to include this as a shell command, but instead to have it be an undocumented 'tool' accessible via something like 'hadoop org.apache.hadoop.tools.ChecksumFile'? Putting it in the Shell hierarchy is nice because we get argument parsing for free, etc., but maybe it's unnecessary.

          To play devil's advocate, though, we do expose FileSystem.getFileChecksum() as a public API, so it seems like offering CLI access to the same API is equivalent.

          Kihwal Lee added a comment -

          Regarding the name of the command: it seems we use the same name when there is an equivalent shell command, and otherwise a more descriptive name. Commands like sum and md5sum exist, so "checksum" may be okay in that sense. But a more descriptive name would be fine too.

          An HDFS checksum is a bit different from the checksums obtained for files on conventional file systems. This has been of no concern until now, as it's mostly internal. But if it is exposed to users, we now have to tell them what it is and what to expect. For example, users must be told that HDFS file checksums can differ even when file contents are identical, due to different block sizes and checksum parameters. Maybe we should mention this in the help.
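
          As a concrete (purely illustrative) way to see this, one could write identical bytes with two different block sizes and compare the results; the paths and sizes below are made up:

            import org.apache.hadoop.conf.Configuration;
            import org.apache.hadoop.fs.FSDataOutputStream;
            import org.apache.hadoop.fs.FileChecksum;
            import org.apache.hadoop.fs.FileSystem;
            import org.apache.hadoop.fs.Path;

            // Illustrative only; not part of the patch.
            public class BlockSizeChecksumDemo {
              public static void main(String[] args) throws Exception {
                FileSystem fs = FileSystem.get(new Configuration());
                byte[] data = new byte[4 * 1024 * 1024];          // 4 MB of zeros
                write(fs, new Path("/demo-1m"), data, 1L << 20);  // 1 MB blocks
                write(fs, new Path("/demo-2m"), data, 2L << 20);  // 2 MB blocks
                FileChecksum c1 = fs.getFileChecksum(new Path("/demo-1m"));
                FileChecksum c2 = fs.getFileChecksum(new Path("/demo-2m"));
                // Identical contents, but the block size feeds into crcPerBlock and
                // hence the algorithm name, so the two results will not match.
                System.out.println(c1.getAlgorithmName());
                System.out.println(c2.getAlgorithmName());
              }

              private static void write(FileSystem fs, Path p, byte[] data,
                  long blockSize) throws Exception {
                FSDataOutputStream out =
                    fs.create(p, true, 4096, fs.getDefaultReplication(), blockSize);
                out.write(data);
                out.close();
              }
            }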

          Colin Patrick McCabe added a comment -
          +      "to the datanode storing each block of the file, and thus is not\n" +
          

          Perhaps this should be "a datanode storing..." to avoid the implication that there is only one place a block is stored.

          I think it would be better to call this command -dumpChecksums. Just calling it "checksum" leaves what it does kind of ambiguous (at least in my mind). A command just called "checksum" could do many things, like create a new checksum for a file that didn't have one, or checksum some data that wasn't checksummed before. "Dump checksum" makes it clear that you're dumping something that already exists.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564876/hadoop-9209.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 2 release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2044//testReport/
          Release audit warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/2044//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2044//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Oops, I had a bad comparator in the TestCLI config. The new patch just fixes the test.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564852/hadoop-9209.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 2 release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.cli.TestCLI

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2043//testReport/
          Release audit warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/2043//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2043//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          The attached patch implements the new shell command.

          In addition to the unit test, I tested manually:

          $ ./bin/hadoop fs -checksum '/*'
          /file1  MD5-of-0MD5-of-512CRC32C        000002000000000000000000b234aa05a75fed38536bda657b20bfcf
          /file1-crc32        MD5-of-0MD5-of-512CRC32 000002000000000000000000593b23e67a7477aab90e42e41478b321
          /file1-crc32-copy   MD5-of-0MD5-of-512CRC32 000002000000000000000000593b23e67a7477aab90e42e41478b321
          
          $ ./bin/hadoop fs -help checksum
          -checksum <src> ...:    Dump checksum information for files that match the file
                          pattern <src> to stdout. Note that this requires a round-trip
                          to the datanode storing each block of the file, and thus is not
                          efficient to run on a large number of files.
          

            People

            • Assignee: Todd Lipcon
            • Reporter: Todd Lipcon
            • Votes: 0
            • Watchers: 11
