  1. Hadoop Common
  2. HADOOP-8233

Turn CRC checking off for 0 byte size and differing blocksizes

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.23.3
    • Fix Version/s: None
    • Component/s: tools/distcp
    • Labels: None

    Description

      DistcpV2 (hadoop-tools/hadoop-distcp/..) can fail with a checksum error, sometimes when copying a 0-byte file. The root cause may be inconsistent HDFS behavior when creating 0-byte files; however, distcp can avoid the issue by skipping the CRC check when the file size is zero.

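      Further, distcp fails the checksum comparison when copying between two clusters that use different block sizes. In this case checking the CRC makes no sense: the HDFS file checksum is derived from per-block checksums, so it changes with the block size and the comparison is guaranteed to fail.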

      We need to turn CRC checking off for the above two cases.
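
      The two cases above can be sketched as a single guard. This is only an illustration of the proposed rule, not the attached patch: the class name CrcCheckPolicy, the method shouldCompareChecksums, and its parameters are hypothetical, and the lengths and block sizes are assumed to have been read from the source and target file status beforehand.

```java
// Hypothetical helper, not part of DistCp: decides whether a CRC
// comparison between a source file and its copy is meaningful.
final class CrcCheckPolicy {

    private CrcCheckPolicy() {
        // static helper only
    }

    /**
     * @param sourceLength    length of the source file in bytes (assumed known)
     * @param sourceBlockSize block size of the source file in bytes
     * @param targetBlockSize block size of the target file in bytes
     * @return false when the CRC check should be skipped
     */
    static boolean shouldCompareChecksums(long sourceLength,
                                          long sourceBlockSize,
                                          long targetBlockSize) {
        // Case 1: a 0-byte file has no data to verify.
        if (sourceLength == 0L) {
            return false;
        }
        // Case 2: differing block sizes make the file checksums differ
        // even when the bytes are identical.
        if (sourceBlockSize != targetBlockSize) {
            return false;
        }
        return true;
    }
}
```

      A caller would run its existing checksum comparison only when shouldCompareChecksums(...) returns true, and otherwise fall back to comparing file lengths alone.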

      Attachments

        1. HADOOP-8233-branch-0.23.2.patch
          2 kB
          Dave Thompson

            People

              Assignee: davet Dave Thompson
              Reporter: davet Dave Thompson
              Votes: 0
              Watchers: 6