  Hadoop HDFS / HDFS-10667

Report more accurate info about data corruption location

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: datanode, hdfs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Per

      https://issues.apache.org/jira/browse/HDFS-10587?focusedCommentId=15376897&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15376897

      The DataNode at 10.6.129.77 reports:

      2016-07-13 11:49:01,512 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving blk_1116167880_42906656 src: /10.6.134.229:43844 dest: /10.6.129.77:5080
      2016-07-13 11:49:01,543 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Checksum error in block blk_1116167880_42906656 from /10.6.134.229:43844
      org.apache.hadoop.fs.ChecksumException: Checksum error: DFSClient_NONMAPREDUCE_2019484565_1 at 81920 exp: 1352119728 got: -1012279895
              at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(Native Method)
              at org.apache.hadoop.util.NativeCrc32.verifyChunkedSumsByteArray(NativeCrc32.java:69)
              at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:347)
              at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:294)
              at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.verifyChunks(BlockReceiver.java:421)
              at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:558)
              at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:789)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:917)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:174)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:80)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
              at java.lang.Thread.run(Thread.java:745)
      2016-07-13 11:49:01,543 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for blk_1116167880_42906656
      java.io.IOException: Terminating due to a checksum error.java.io.IOException: Unexpected checksum mismatch while writing blk_1116167880_42906656 from /10.6.134.229:43844
              at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:571)
              at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:789)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:917)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:174)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:80)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
              at java.lang.Thread.run(Thread.java:745)
      

      and

      https://issues.apache.org/jira/browse/HDFS-10587?focusedCommentId=15378879&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15378879

      When only a packet is being verified, the position reported in the checksum exception is relative to the packet buffer, not to the block. So the 81920 in the exception above is an offset within the packet, not within the block.

      This JIRA was created to report more accurate corruption location information: the offset in the file, the offset in the block, and the offset in the packet.
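
The relationship between the three offsets can be sketched as follows. This is a minimal illustration and not the code from the attached patches: the class `CorruptionLocation`, the method `fromPacketOffset`, and the parameters `packetStartInBlock` and `blockStartInFile` are hypothetical names, and it assumes the receiver knows where the packet begins in the block and where the block begins in the file.

```java
/**
 * Hypothetical holder for the three offsets named in this issue.
 * Not part of Hadoop; all names here are illustrative assumptions.
 */
final class CorruptionLocation {
    final long offsetInPacket; // position reported by the checksum exception
    final long offsetInBlock;  // packet start within the block + offsetInPacket
    final long offsetInFile;   // block start within the file + offsetInBlock

    private CorruptionLocation(long offsetInPacket, long offsetInBlock, long offsetInFile) {
        this.offsetInPacket = offsetInPacket;
        this.offsetInBlock = offsetInBlock;
        this.offsetInFile = offsetInFile;
    }

    /**
     * Derives block-relative and file-relative offsets from a packet-relative one.
     *
     * @param offsetInPacket     packet-relative position of the bad chunk
     * @param packetStartInBlock assumed: offset of the packet's first byte in the block
     * @param blockStartInFile   assumed: offset of the block's first byte in the file
     */
    static CorruptionLocation fromPacketOffset(long offsetInPacket,
                                               long packetStartInBlock,
                                               long blockStartInFile) {
        if (offsetInPacket < 0 || packetStartInBlock < 0 || blockStartInFile < 0) {
            throw new IllegalArgumentException("offsets must be non-negative");
        }
        // addExact throws ArithmeticException on overflow instead of wrapping silently.
        long offsetInBlock = Math.addExact(packetStartInBlock, offsetInPacket);
        long offsetInFile = Math.addExact(blockStartInFile, offsetInBlock);
        return new CorruptionLocation(offsetInPacket, offsetInBlock, offsetInFile);
    }

    @Override
    public String toString() {
        return "offsetInFile=" + offsetInFile
            + ", offsetInBlock=" + offsetInBlock
            + ", offsetInPacket=" + offsetInPacket;
    }
}
```

For example, `CorruptionLocation.fromPacketOffset(81920L, packetStart, blockStart)` would take the 81920 from the log above as the packet-relative position. The other two arguments are not shown in that log and would have to come from the receiver's own bookkeeping.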

      See

      https://issues.apache.org/jira/browse/HDFS-10587?focusedCommentId=15387083&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15387083

        Attachments

        1. HDFS-10667.001.patch
          2 kB
          Yuanbo Liu
        2. HDFS-10667.002.patch
          2 kB
          Yuanbo Liu
        3. HDFS-10667.003.patch
          2 kB
          Yuanbo Liu
        4. HDFS-10667.004.patch
          1 kB
          Yuanbo Liu
        5. HDFS-10667.005.patch
          1 kB
          Yuanbo Liu

              People

              • Assignee:
                yuanbo Yuanbo Liu
                Reporter:
                yzhangal Yongjun Zhang
              • Votes:
                0
                Watchers:
                4
