Hadoop Common / HADOOP-16049

DistCp result has data and checksum mismatch when blocks per chunk > 0

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.9.2
    • Fix Version/s: 2.9.3
    • Component/s: tools/distcp
    • Labels:
      None

      Description

      In 2.9.2, RetriableFileCopyCommand.copyBytes contains the following loop:

      int bytesRead = readBytes(inStream, buf, sourceOffset);
      while (bytesRead >= 0) {
        ...
        if (action == FileAction.APPEND) {
          sourceOffset += bytesRead;
        }
        ... // write to dst
        bytesRead = readBytes(inStream, buf, sourceOffset);
      }

      It does a positioned read, but the position (`sourceOffset` here) is never updated when blocks per chunk is set to > 0 (which always disables the append action). So for a chunk with offset != 0, it keeps copying the first few bytes again and again, causing the result to have a data and checksum mismatch.
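The effect can be illustrated with a small, self-contained model. This is only a sketch and not the DistCp code: `readAt`, `copyChunk`, and the `advanceOffset` flag are hypothetical names, and a byte array stands in for the source file. With `advanceOffset` set to false the read position stays at the chunk start, which mirrors the behaviour described above.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class PositionedReadDemo {

  // Hypothetical stand-in for a positioned read: copies up to buf.length
  // bytes of src starting at `position`. Returns -1 at end of data.
  static int readAt(byte[] src, long position, byte[] buf) {
    if (position >= src.length) {
      return -1;
    }
    int n = (int) Math.min(buf.length, src.length - position);
    System.arraycopy(src, (int) position, buf, 0, n);
    return n;
  }

  // Copies chunkLength bytes starting at chunkOffset using positioned reads.
  // When advanceOffset is false, sourceOffset is never moved forward, so
  // every read returns the same leading bytes of the chunk.
  static byte[] copyChunk(byte[] src, long chunkOffset, long chunkLength,
                          int bufSize, boolean advanceOffset) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[bufSize];
    long sourceOffset = chunkOffset;
    long remaining = chunkLength;
    while (remaining > 0) {
      int bytesRead = readAt(src, sourceOffset, buf);
      if (bytesRead < 0) {
        break;
      }
      int toWrite = (int) Math.min(bytesRead, remaining);
      out.write(buf, 0, toWrite);
      remaining -= toWrite;
      if (advanceOffset) {
        sourceOffset += bytesRead;
      }
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] src = new byte[32];
    for (int i = 0; i < src.length; i++) {
      src[i] = (byte) i;
    }
    byte[] expected = Arrays.copyOfRange(src, 8, 24);
    byte[] stuck = copyChunk(src, 8, 16, 4, false);
    byte[] advanced = copyChunk(src, 8, 16, 4, true);
    System.out.println("offset not advanced, matches source: "
        + Arrays.equals(expected, stuck));
    System.out.println("offset advanced, matches source: "
        + Arrays.equals(expected, advanced));
  }
}
```

In this model the copy with the stuck offset has the right length but the wrong content, which is the kind of result a checksum comparison would flag.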

      To reproduce this issue in branch-2, update BLOCK_SIZE to 10240 (larger than the default copy buffer size) in class TestDistCpSystem and run the test.

      HADOOP-15292 resolved the issue reported in this ticket in trunk, branch-3.1, and branch-3.2 by not using the positioned read, but it has not been backported to branch-2 yet.
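For comparison, a non-positioned copy can be sketched as follows. This is an assumption-laden illustration using plain `java.io` streams, not the HADOOP-15292 patch: `SequentialReadDemo` and `copyChunk` are hypothetical names. The stream is moved to the chunk offset once, and each sequential read then advances the stream's own cursor, so there is no separate offset variable to keep in sync.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class SequentialReadDemo {

  // Skips to chunkOffset once, then copies up to chunkLength bytes with
  // ordinary sequential reads.
  static byte[] copyChunk(InputStream in, long chunkOffset, long chunkLength,
                          int bufSize) throws IOException {
    long skipped = 0;
    while (skipped < chunkOffset) {
      long n = in.skip(chunkOffset - skipped);
      if (n <= 0) {
        throw new IOException("could not skip to chunk offset");
      }
      skipped += n;
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[bufSize];
    long remaining = chunkLength;
    while (remaining > 0) {
      int toRead = (int) Math.min(buf.length, remaining);
      int bytesRead = in.read(buf, 0, toRead);
      if (bytesRead < 0) {
        break; // source ended before the chunk was complete
      }
      out.write(buf, 0, bytesRead);
      remaining -= bytesRead;
    }
    return out.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] src = new byte[32];
    for (int i = 0; i < src.length; i++) {
      src[i] = (byte) i;
    }
    byte[] copied = copyChunk(new ByteArrayInputStream(src), 8, 16, 4);
    System.out.println("matches source: "
        + Arrays.equals(Arrays.copyOfRange(src, 8, 24), copied));
  }
}
```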

       

       

        Attachments

        1. HADOOP-16049-branch-2-003.patch
          8 kB
          Kai Xie
        2. HADOOP-16049-branch-2-004.patch
          21 kB
          Kai Xie
        3. HADOOP-16049-branch-2-003.patch
          21 kB
          Kai Xie
        4. HADOOP-16049-branch-2-005.patch
          21 kB
          Kai Xie

              People

              • Assignee:
                Kai Xie (kai33)
                Reporter:
                Kai Xie (kai33)
              • Votes:
                0
                Watchers:
                3
