Hadoop HDFS / HDFS-2359

NPE found in Datanode log when disks fail during HDFS operations

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: 0.20.205.0
    • Component/s: datanode
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Scenario:
      I have a cluster of 4 Datanodes, each with 12 disks.

      In hdfs-site.xml I have set "dfs.datanode.failed.volumes.tolerated=3".

      While distcp (hdfs->hdfs) is running, I fail 3 disks on one Datanode by setting the permissions of their data directories to 000. The distcp job succeeds, but I see NullPointerExceptions in the Datanode log.
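
For context, the tolerance setting can be read as a simple threshold: with 12 configured volumes and a tolerance of 3, a Datanode would be expected to stay in service after exactly 3 volume failures, which is consistent with the distcp job succeeding. The sketch below illustrates only that arithmetic. The class and method names are hypothetical, and it is not the Datanode's actual code.

```java
// Hypothetical illustration of the failed-volume threshold; not Hadoop code.
final class VolumeToleranceSketch {

    /**
     * Returns true if a Datanode with `configured` volumes would be expected
     * to stay in service after `failed` of them fail, assuming it tolerates
     * up to `tolerated` failures (dfs.datanode.failed.volumes.tolerated).
     */
    static boolean staysInService(int configured, int tolerated, int failed) {
        if (configured <= 0 || tolerated < 0 || failed < 0 || failed > configured) {
            throw new IllegalArgumentException("invalid volume counts");
        }
        return failed <= tolerated;
    }

    public static void main(String[] args) {
        // Scenario from this report: 12 disks, tolerance 3, 3 disks failed.
        System.out.println(staysInService(12, 3, 3)); // prints "true"
        // One more failure would exceed the tolerance.
        System.out.println(staysInService(12, 3, 4)); // prints "false"
    }
}
```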

      In one thread
      $hadoop distcp /user/$HADOOPQA_USER/data1 /user/$HADOOPQA_USER/data3

      In another thread in a datanode
      $ chmod 000 /xyz/{0,1,2}/hadoop/var/hdfs/data

      where [ dfs.data.dir is set as /xyz/{0..11}/hadoop/var/hdfs/data ]

      Log Snippet from the Datanode
      =============

      2011-09-19 12:43:40,314 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_7065198814142552283_62557. BlockInfo not found in volumeMap.
      2011-09-19 12:43:40,314 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_7066946313092770579_39189. BlockInfo not found in volumeMap.
      2011-09-19 12:43:40,314 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_7070305189404753930_49359. BlockInfo not found in volumeMap.
      2011-09-19 12:43:40,327 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
      java.io.IOException: Error in deleting blocks.
          at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:1820)
          at org.apache.hadoop.hdfs.server.datanode.DataNode.processCommand(DataNode.java:1074)
          at org.apache.hadoop.hdfs.server.datanode.DataNode.processCommand(DataNode.java:1036)
          at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:891)
          at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1419)
          at java.lang.Thread.run(Thread.java:619)
      2011-09-19 12:43:41,304 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(xx.xxx.xxx.xxx:xxxx, storageID=xx-xxxxxxxxxxxx-xx.xxx.xxx.xxx-xxxx-xxxxxxxxxxx, infoPort=1006, ipcPort=8020):DataXceiver
      java.lang.NullPointerException
          at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner$LogFileHandler.appendLine(DataBlockScanner.java:788)
          at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.updateScanStatusInternal(DataBlockScanner.java:365)
          at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.verifiedByClient(DataBlockScanner.java:308)
          at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:205)
          at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
          at java.lang.Thread.run(Thread.java:619)
      2011-09-19 12:43:43,313 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_7071818644980664768_40827. BlockInfo not found in volumeMap.
      2011-09-19 12:43:43,313 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_7073840977856837621_62108. BlockInfo not found in volumeMap.
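
The second trace shows the NullPointerException raised in DataBlockScanner$LogFileHandler.appendLine while a client read was being recorded. One plausible reading, which this report does not confirm, is that the handler's output stream was null once the volume holding the scanner log became inaccessible. The sketch below shows the general null-guard pattern for that situation. ScanLogSketch and its members are hypothetical stand-ins; this is neither the DataBlockScanner code nor the attached patch.

```java
import java.io.PrintStream;

// Hypothetical stand-in for a scan-log writer; not the DataBlockScanner code.
final class ScanLogSketch {
    // Null when the log file could not be opened, e.g. its directory is unreadable.
    private final PrintStream out;
    private int linesWritten = 0;

    ScanLogSketch(PrintStream out) {
        this.out = out;
    }

    /** Appends one line; returns false instead of throwing when no stream is open. */
    synchronized boolean appendLine(String line) {
        if (out == null) {
            return false; // no usable log stream: skip the write
        }
        out.println(line);
        linesWritten++;
        return true;
    }

    synchronized int linesWritten() {
        return linesWritten;
    }

    public static void main(String[] args) {
        ScanLogSketch healthy = new ScanLogSketch(System.out);
        ScanLogSketch failed = new ScanLogSketch(null);
        // The healthy handler writes the line and reports success.
        System.out.println(healthy.appendLine("verified blk_1"));
        // The handler without a stream reports failure without throwing.
        System.out.println(failed.appendLine("verified blk_2"));
    }
}
```

With a guard of this kind, a caller such as a read path could ignore a false return and continue serving the block, rather than letting the exception propagate into the DataXceiver thread.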

      Attachments

        1. HDFS-2359-branch-0.20-security.patch
          0.6 kB
          Jonathan Turner Eagles

          People

            Assignee: Jonathan Turner Eagles (jeagles)
            Reporter: Rajit Saha (rajsaha)