  1. Hadoop HDFS
  2. HDFS-8224

Schedule a block for scanning if its metadata file is corrupt

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.8.0, 2.7.4, 3.0.0-alpha1
    • Component/s: datanode
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      This happened in our 2.6 cluster.
      One of the blocks and its metadata file were corrupted.
      The disk itself was healthy in this case; only the block was corrupt.

      The Namenode tried to have that block copied to another datanode, but the transfer failed with the following stack trace:

      2015-04-20 01:04:04,421 [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN datanode.DataNode: DatanodeRegistration(a.b.c.d, datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, infoSecurePort=0, ipcPort=8020, storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to a1.b1.c1.d1:1004 got
      java.io.IOException: Could not create DataChecksum of type 0 with bytesPerChecksum 0
      at org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125)
      at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175)
      at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140)
      at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102)
      at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:287)
      at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989)
      at java.lang.Thread.run(Thread.java:722)
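
One plausible reading of this trace is that the metadata file's header was zeroed out, so both the checksum type and bytesPerChecksum were read as 0. The sketch below is a simplified stand-in, not Hadoop's code: the class name MetaHeaderSketch and the assumed header layout (2-byte version, 1-byte checksum type, 4-byte bytesPerChecksum) are illustrative assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

/** Simplified stand-in for a block metadata header reader; not Hadoop's code. */
final class MetaHeaderSketch {
    private MetaHeaderSketch() {}

    /**
     * Reads an assumed 7-byte header: version (2 bytes), checksum type (1 byte),
     * bytesPerChecksum (4 bytes). Returns bytesPerChecksum when it is positive.
     */
    static int readBytesPerChecksum(byte[] metaFileBytes) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(metaFileBytes))) {
            in.readShort();                      // version, ignored in this sketch
            int type = in.readByte();            // checksum type id
            int bytesPerChecksum = in.readInt(); // bytes covered by one checksum
            if (bytesPerChecksum <= 0) {
                // A non-positive value cannot describe a usable checksum.
                throw new IOException("Could not create DataChecksum of type "
                    + type + " with bytesPerChecksum " + bytesPerChecksum);
            }
            return bytesPerChecksum;
        }
    }
}
```

Under these assumptions, calling MetaHeaderSketch.readBytesPerChecksum(new byte[7]) throws an IOException whose message matches the one in the trace, even though nothing is wrong with the disk.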

      The following catch block in the DataTransfer#run method treats every IOException as a disk error and runs the disk error check:

      catch (IOException ie) {
        LOG.warn(bpReg + ":Failed to transfer " + b + " to " +
            targets[0] + " got ", ie);
        // check if there are any disk problems
        checkDiskErrorAsync();
      }
      

      This block was never scanned by BlockPoolSliceScanner; otherwise it would have been reported as a corrupt block.
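
A minimal sketch of the idea in the issue title: tell a corrupt metadata header apart from other I/O failures, and queue the affected block for scanning instead of only checking the disk. Every name here (CorruptMetadataException, TransferFailureHandler, suspectBlocks, diskChecksRequested) is hypothetical and does not describe the attached patches. Skipping the disk check for corrupt metadata is also an assumption of this sketch.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical marker for a corrupt metadata header; not a Hadoop class. */
class CorruptMetadataException extends IOException {
    CorruptMetadataException(String message) {
        super(message);
    }
}

/** Hypothetical handler showing one way to route transfer failures. */
final class TransferFailureHandler {
    /** Blocks waiting for a priority scan, most recent suspect first. */
    final Deque<String> suspectBlocks = new ArrayDeque<>();
    /** Counts how often a disk check was requested. */
    int diskChecksRequested = 0;

    void onTransferFailure(String blockId, IOException cause) {
        if (cause instanceof CorruptMetadataException) {
            // The metadata is unreadable: queue the block so that a scanner
            // can verify it and report it as corrupt.
            if (!suspectBlocks.contains(blockId)) {
                suspectBlocks.addFirst(blockId);
            }
            return;
        }
        // Any other I/O failure may still point to a disk problem.
        diskChecksRequested++;
    }
}
```

With this routing, a failure like the one in the stack trace would put the block at the front of the scan queue, while unrelated I/O errors would still trigger the disk check.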

        Attachments

        1. HDFS-8224-trunk-3.patch
          9 kB
          Rushabh S Shah
        2. HDFS-8224-trunk-2.patch
          11 kB
          Rushabh S Shah
        3. HDFS-8224-trunk-1.patch
          9 kB
          Rushabh S Shah
        4. HDFS-8224-trunk.patch
          11 kB
          Rushabh S Shah
        5. HDFS-8224-branch-2.patch
          9 kB
          Rushabh S Shah
        6. HDFS-8224-branch-2.7.patch
          9 kB
          Wei-Chiu Chuang

            People

            • Assignee:
              shahrs87 Rushabh S Shah
            • Reporter:
              shahrs87 Rushabh S Shah
            • Votes:
              0
            • Watchers:
              12
