Hadoop Common / HADOOP-8239

Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.3, 2.0.2-alpha
    • Component/s: fs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
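
      For illustration, the client-facing surface this change touches is FileSystem.getFileChecksum(). A minimal sketch of inspecting the reported checksum (the path and configuration are placeholders, not taken from this issue):

        // Sketch only: print the checksum algorithm a FileSystem reports for a file.
        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileChecksum;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class ChecksumProbe {
          public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            FileChecksum checksum = fs.getFileChecksum(new Path("/user/example/data.txt"));
            if (checksum != null) {
              // Before this change the algorithm name always implied CRC32; with the
              // extension it can also reflect the actual type in use, e.g. CRC32C.
              System.out.println(checksum.getAlgorithmName());
            }
          }
        }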

      Attachments

      1. hadoop-8239-branch-0.23.patch.txt
        11 kB
        Kihwal Lee
      2. hadoop-8239-trunk-branch2.patch.txt
        11 kB
        Kihwal Lee
      3. hadoop-8239-trunk-branch2.patch.txt
        8 kB
        Kihwal Lee
      4. hadoop-8239-trunk-branch2.patch.txt
        11 kB
        Kihwal Lee
      5. hadoop-8239-before-hadoop-8240.patch.txt
        9 kB
        Kihwal Lee
      6. hadoop-8239-after-hadoop-8240.patch.txt
        9 kB
        Kihwal Lee
      7. hadoop-8239-before-hadoop-8240.patch.txt
        6 kB
        Kihwal Lee
      8. hadoop-8239-after-hadoop-8240.patch.txt
        6 kB
        Kihwal Lee

        Issue Links

          Activity

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #351 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/351/)
          HADOOP-8239. Add subclasses of MD5MD5CRC32FileChecksum to support file checksum with CRC32C. (Kihwal Lee via szetszwo) (Revision 1375834)

          Result = SUCCESS
          tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1375834
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32CastagnoliFileChecksum.java
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.java
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32GzipFileChecksum.java
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DataChecksum.java
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
          Thomas Graves added a comment -

          I pulled this into 0.23.3

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1173 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1173/)
          HADOOP-8239. Add subclasses of MD5MD5CRC32FileChecksum to support file checksum with CRC32C. Contributed by Kihwal Lee (Revision 1375450)

          Result = FAILURE
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1375450
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32CastagnoliFileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32GzipFileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DataChecksum.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1141 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1141/)
          HADOOP-8239. Add subclasses of MD5MD5CRC32FileChecksum to support file checksum with CRC32C. Contributed by Kihwal Lee (Revision 1375450)

          Result = FAILURE
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1375450
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32CastagnoliFileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32GzipFileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DataChecksum.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
          Kihwal Lee added a comment -

          Attaching the patch for branch-0.23.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #2640 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2640/)
          HADOOP-8239. Add subclasses of MD5MD5CRC32FileChecksum to support file checksum with CRC32C. Contributed by Kihwal Lee (Revision 1375450)

          Result = FAILURE
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1375450
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32CastagnoliFileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32GzipFileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DataChecksum.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #2611 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2611/)
          HADOOP-8239. Add subclasses of MD5MD5CRC32FileChecksum to support file checksum with CRC32C. Contributed by Kihwal Lee (Revision 1375450)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1375450
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32CastagnoliFileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32GzipFileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DataChecksum.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #2675 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2675/)
          HADOOP-8239. Add subclasses of MD5MD5CRC32FileChecksum to support file checksum with CRC32C. Contributed by Kihwal Lee (Revision 1375450)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1375450
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32CastagnoliFileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32GzipFileChecksum.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DataChecksum.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
          Tsz Wo Nicholas Sze added a comment -

          I have committed this. Thanks, Kihwal!

          Tsz Wo Nicholas Sze added a comment -

          +1 patch looks good.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12541715/hadoop-8239-trunk-branch2.patch.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.hdfs.TestClientReportBadBlock

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1338//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1338//console

          This message is automatically generated.

          Kihwal Lee added a comment -

          MD5MD5CRC32GzipFileChecksum and MD5MD5CRC32CastagnoliFileChecksum should not have the following fields.

          The last patch is supposed to fix this, but the files were not added. Sorry about that.

           DataChecksum.MIXED is not used. Why do we need it? Could we add it later?

           Any file system implementation that's using MD5MD5CRC32FileChecksum will need it, since a file can contain blocks with different checksum types. This is not desired, but at least we should be able to detect it. So I think it belongs here and will be used by HDFS-3177.

          I will post the corrected patch.
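
           For illustration only, a minimal sketch of the kind of detection described above, assuming the MIXED constant lands on DataChecksum.Type as discussed (class and method names here are hypothetical; the real wiring belongs to HDFS-3177):

             import java.util.List;
             import org.apache.hadoop.util.DataChecksum;

             // Sketch: derive a file-level checksum type from per-block types,
             // falling back to MIXED when the blocks disagree.
             final class ChecksumTypeResolver {
               static DataChecksum.Type resolve(List<DataChecksum.Type> blockTypes) {
                 DataChecksum.Type result = null;
                 for (DataChecksum.Type t : blockTypes) {
                   if (result == null) {
                     result = t;                      // first block sets the candidate type
                   } else if (result != t) {
                     return DataChecksum.Type.MIXED;  // mismatch: flag it rather than guess
                   }
                 }
                 return result;                       // null if the file has no blocks
               }
             }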

          Tsz Wo Nicholas Sze added a comment -

          Hi Kihwal,

          Forgot to add the new files? The new classes are not found in the patch.

           The previous Jenkins build did not test your patch. It seems that it failed to download the patch file. The patch file shown in Build Artifacts had zero bytes. However, Jenkins still ran the build as usual. I filed HADOOP-8714 for this problem.

          I looked at your previous two patches. Some comments:

           • DataChecksum.MIXED is not used. Why do we need it? Could we add it later?
          • MD5MD5CRC32GzipFileChecksum and MD5MD5CRC32CastagnoliFileChecksum should not have the following fields.
            +  private int bytesPerCRC;
            +  private long crcPerBlock;
            +  private MD5Hash md5;
            

            They should use the fields in the super class. Also the constructors should call the super class constructors.
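
             For reference, a subclass along the lines suggested here might delegate entirely to the parent; a sketch, assuming the parent keeps its (bytesPerCRC, crcPerBlock, md5) constructor and exposes getCrcType() (not necessarily the committed code):

               import org.apache.hadoop.fs.MD5MD5CRC32FileChecksum;
               import org.apache.hadoop.io.MD5Hash;
               import org.apache.hadoop.util.DataChecksum;

               /** Sketch: CRC32C variant reusing the parent's fields and constructors. */
               public class MD5MD5CRC32CastagnoliFileChecksum extends MD5MD5CRC32FileChecksum {
                 public MD5MD5CRC32CastagnoliFileChecksum() {
                   this(0, 0, null);
                 }

                 public MD5MD5CRC32CastagnoliFileChecksum(int bytesPerCRC, long crcPerBlock,
                     MD5Hash md5) {
                   super(bytesPerCRC, crcPerBlock, md5);  // no duplicated fields in the subclass
                 }

                 @Override
                 public DataChecksum.Type getCrcType() {
                   return DataChecksum.Type.CRC32C;       // only the checksum type differs
                 }
               }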

          Kihwal Lee added a comment -

          -1 tests included. The patch doesn't appear to include any new or modified tests.

          Additional test cases will be in HDFS-3177.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12541651/hadoop-8239-trunk-branch2.patch.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1332//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1332//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12541629/hadoop-8239-trunk-branch2.patch.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.ha.TestZKFailoverController
          org.apache.hadoop.hdfs.TestDataTransferKeepalive
          org.apache.hadoop.hdfs.TestDistributedFileSystem

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1329//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/1329//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1329//console

          This message is automatically generated.

          Kihwal Lee added a comment -

          BAD patch. I will fix it and reupload in a bit.

          Kihwal Lee added a comment -

          The new patch adds a separate class for each checksum type used in MD5MD5CRC32FileChecksum.

           MD5MD5CRC32FileChecksum has a new getCrcType() method, and the subclasses override it.

          Kihwal Lee added a comment -

          Correction: MD5MD5CRC32FileChecksum#getCrcType() is not needed.

          Kihwal Lee added a comment -

           I think adding a new class is a good idea. Since DFS.getFileChecksum is expected to return MD5MD5CRC32FileChecksum in a lot of places, subclassing MD5MD5CRC32FileChecksum for each variant could work.

          We can regard "CRC32" in MD5MD5CRC32FileChecksum as a generic term for any 32 bit CRC algorithms. At least that is the case in current 2.0/trunk. If we go with this, subclassing MD5MD5CRC32FileChecksum for each variant makes sense.

          The following is what I am thinking:

          In MD5MD5CRC32FileChecksum

          The constructor sets crcType to DataChecksum.Type.CRC32

          /** 
           * getAlgorithmName() will use it to construct the name
           */ 
          private DataChecksum.Type getCrcType() {
            return crcType;
          }
          
          public ChecksumOpt getChecksumOpt() {
             return new ChecksumOpt(getCrcType(), bytesPerCRC);
          }
          

          Subclass MD5MD5CRC32GzipFileChecksum
          The constructor sets crcType to DataChecksum.Type.CRC32

          Subclass MD5MD5CRC32CastagnoliFileChecksum
          The constructor sets crcType to DataChecksum.Type.CRC32C

          Interoperability & compatibility

          • Any existing user/hadoop code that expects MD5MD5CRC32FileChecksum from DFS.getFileChecksum() will continue to work.
           • Any new code that makes use of the new getChecksumOpt() will work as long as DFSClient#getFileChecksum() creates and returns the right object. This will be done in HDFS-3177, and without it, everything will default to CRC32, which is the current behavior of branch-2/trunk.
          • A newer client calling getFileChecksum() to an old cluster over hftp or webhdfs will work. (always CRC32)
           • An older client calling getFileChecksum() against a newer cluster - If the remote file on the newer cluster is in CRC32, both hftp and webhdfs work. If it is CRC32C or anything else, hftp will have a checksum mismatch. In webhdfs, the client will get an algorithm field that won't match anything the old MD5MD5CRC32FileChecksum can create, and WebHdfsFileSystem will generate an IOException, "Algorithm not matched:....".

          I think this is reasonable. What do you think?
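
           A sketch of how a caller might use the proposed getChecksumOpt() (this leans on the ChecksumOpt API from HADOOP-8240 and the server-side changes in HDFS-3177; the path is a placeholder):

             import org.apache.hadoop.conf.Configuration;
             import org.apache.hadoop.fs.FileChecksum;
             import org.apache.hadoop.fs.FileSystem;
             import org.apache.hadoop.fs.MD5MD5CRC32FileChecksum;
             import org.apache.hadoop.fs.Options.ChecksumOpt;
             import org.apache.hadoop.fs.Path;

             // Sketch: existing callers that downcast to MD5MD5CRC32FileChecksum keep
             // working; new callers can read the checksum type via getChecksumOpt().
             public class ChecksumOptProbe {
               public static void main(String[] args) throws Exception {
                 FileSystem fs = FileSystem.get(new Configuration());
                 FileChecksum fc = fs.getFileChecksum(new Path("/user/example/src.dat"));
                 if (fc instanceof MD5MD5CRC32FileChecksum) {
                   ChecksumOpt opt = ((MD5MD5CRC32FileChecksum) fc).getChecksumOpt();
                   System.out.println("checksum type: " + opt.getChecksumType());
                 }
               }
             }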

          Tsz Wo Nicholas Sze added a comment -

          > One approach I can think of is to leave the current readFields()/write() methods unchanged. I think only WebHdfs is using it and if that is true, we can make WebHdfs actually send and receive everything in JSON format and keep the current "bytes" Json field as is.

           FileChecksum is designed to support different kinds of checksum algorithms, so it has the following abstract methods:

          public abstract String getAlgorithmName();
          public abstract int getLength();
          public abstract byte[] getBytes();
          

          WebHDFS FileChecksum schema has fields corresponding to these methods.

          With FileChecksum, clients like WebHDFS could obtain the corresponding checksum by first getting the checksum algorithm name and then computing the bytes. If we add MD5MD5CRC32FileChecksum specific fields to the JSON format, then it is harder to support other algorithms and harder to specify the WebHDFS API since we have to specify the cases for each algorithm in the API.

           For our task here, we are actually adding new algorithms, since we have to change the algorithm name for different CRC types. So we may as well add new classes to handle them instead of changing MD5MD5CRC32FileChecksum. BTW, the name "MD5MD5CRC32FileChecksum" is not suitable for the other CRC type because it contains "CRC32". Thoughts?
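
           For context, a WebHDFS GETFILECHECKSUM response carries exactly those three fields; it looks roughly like the following (values are illustrative, not taken from a real run):

             {
               "FileChecksum": {
                 "algorithm": "MD5-of-1MD5-of-512CRC32C",
                 "bytes": "...hex-encoded checksum bytes...",
                 "length": 28
               }
             }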

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12541504/hadoop-8239-before-hadoop-8240.patch.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1328//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1328//console

          This message is automatically generated.

          Kihwal Lee added a comment -

          Just posted a new patch that makes getFileChecksum() interoperable.

          Kihwal Lee added a comment -

          It may be better to move the JSON util methods to MD5MD5CRC32FileChecksum.java, since they will have to know the internals of MD5MD5CRC32FileChecksum.

           On second thought, JsonUtil is in the o.a.h.hdfs.web package, so I will leave it there.

          Kihwal Lee added a comment -

          I think XML is fine. XML parsing is done at the document level, so we can safely find out or ignore the existence of the extra parameter and not worry about the size of data. I tried calling getFileChecksum() over Hftp between a patched 0.23 cluster and a 1.0.x cluster, and it worked fine both ways.

           The change you suggested does not solve the whole problem. The magic number is like a simple binary length field: its presence or absence tells you how much data you need to read. So the read side of the patched version works even when reading from an unpatched version, but the reverse is not true; the unpatched version will always leave something unread in the stream. XML is nice in that it inherently has begin and end markers and is not sensitive to size changes.

           Since JsonUtil depends on these serialization/deserialization methods, I don't think it can achieve bidirectional compatibility by modifying only one side. If it had used XML and did not do the length check, it would have no such problem. A fully JSON-ized approach could have worked as well.

           One approach I can think of is to leave the current readFields()/write() methods unchanged. I think only WebHdfs is using them, and if that is true, we can make WebHdfs actually send and receive everything in JSON format and keep the current "bytes" JSON field as is. When it does not find the "new" fields from an old data source, it can do the old deserialization on "bytes". Similarly, it should send everything in individual JSON fields as well as the old serialized "bytes".

          It may be better to move the JSON util methods to MD5MD5CRC32FileChecksum.java, since they will have to know the internals of MD5MD5CRC32FileChecksum.

          Tsz Wo Nicholas Sze added a comment -
           • We cannot use EOFException in readFields(..) since it won't work if the serialized bytes are not at the end of the stream. How about we add a magic number in the new serialization so that it can distinguish between the new and old serializations? For example,
            //MD5MD5CRC32FileChecksum.java
              final int MAGIC = 0xABCDEF12;
            
              public void readFields(DataInput in) throws IOException {
                final int firstInt = in.readInt();
                if (firstInt != MAGIC) {
                  //old serialization: crcType is CRC32, firstInt is bytesPerCRC 
                  crcType = DataChecksum.Type.CRC32;
                  bytesPerCRC = firstInt;
                  crcPerBlock = in.readLong();
                  md5 = MD5Hash.read(in);
                } else {
                  //new serialization includes crcType
                  crcType = DataChecksum.Type.valueOf(in.readInt());
                  bytesPerCRC = in.readInt();
                  crcPerBlock = in.readLong();
                  md5 = MD5Hash.read(in);
                }
              }
            
              public void write(DataOutput out) throws IOException {
                if (crcType != DataChecksum.Type.CRC32) {
                  //for non-CRC32 type, write magic and crcType
                  out.writeInt(MAGIC);
                  out.writeInt(crcType.id);
                }
                out.writeInt(bytesPerCRC);
                out.writeLong(crcPerBlock);
                md5.write(out);
              }
            

            If we decide to do it, we should do the same for XML and JSON.

           • Need to update org.apache.hadoop.hdfs.web.JsonUtil. I guess you may already have a plan to do it separately.
          Kihwal Lee added a comment -

          -1 tests included. The patch doesn't appear to include any new or modified tests.

           No new test is added in this patch. Part of the compatibility is checked by existing tests exercising getFileChecksum(). More tests are coming with HDFS-3177.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common:
          org.apache.hadoop.ha.TestZKFailoverController

          This is a known issue. See HADOOP-8591.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12541210/hadoop-8239-before-hadoop-8240.patch.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common:

          org.apache.hadoop.ha.TestZKFailoverController

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1317//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1317//console

          This message is automatically generated.

          Kihwal Lee added a comment -

           The attached patch updates MD5MD5CRC32FileChecksum to handle the checksum type. The change is backward compatible; e.g., getFileChecksum() calls between 1.x and 2.x clusters work over hftp or webhdfs. The actual implementation of the updated call is in HDFS-3177.

          Rohini Palaniswamy added a comment -

           When you fix this jira, can you add public getter methods for bytesPerCRC, crcPerBlock, checksum and checksumType in MD5MD5CRC32FileChecksum?

           We have an internal tool/library that mimics the computation of the HDFS checksum for a file downloaded to local filesystems (NFS/Linux filesystem). This is for customers to validate the checksum of a file after they have downloaded it to other filesystems from HDFS via http (hdfsproxy). Since those fields were not accessible, we wrote a wrapper that parses the MD5MD5CRC32FileChecksum bytes on its own, similar to JsonUtil (HDFS-3176). If public getter methods were available, it would help get rid of the wrapper class.
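
           The accessors being asked for might look roughly like the following additions inside MD5MD5CRC32FileChecksum (a sketch of the request, not a committed API; the checksum-type getter assumes the getCrcType() discussed elsewhere in this issue):

             // Sketch of the requested read-only accessors.
             public int getBytesPerCRC() {
               return bytesPerCRC;
             }

             public long getCrcPerBlock() {
               return crcPerBlock;
             }

             public MD5Hash getChecksum() {
               return md5;
             }

             public DataChecksum.Type getChecksumType() {
               return getCrcType();
             }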

          Kihwal Lee added a comment -

          HDFS-3176 must be fixed first. Otherwise, webhdfs will break with this patch.


            People

            • Assignee:
              Kihwal Lee
              Reporter:
              Kihwal Lee
             • Votes:
               0
               Watchers:
               9

              Dates

              • Created:
                Updated:
                Resolved:

                Development