Details
Bug
Status: Closed
Major
Resolution: Fixed
0.20.205.0
None
Reviewed
Description
Scenario:
I have a cluster of 4 DataNodes, each with 12 disks.
In hdfs-site.xml I have set "dfs.datanode.failed.volumes.tolerated=3".
While a distcp job (hdfs -> hdfs) was running, I failed 3 disks on one DataNode by setting the permissions of their data directories to 000. The distcp job succeeded, but I see a NullPointerException in the DataNode log.
In one thread
$ hadoop distcp /user/$HADOOPQA_USER/data1 /user/$HADOOPQA_USER/data3
In another thread in a datanode
$ chmod 000 /xyz/
/hadoop/var/hdfs/data
(where dfs.data.dir is set to /xyz/{0..11}/hadoop/var/hdfs/data)
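For readers unfamiliar with the setting, the following is a minimal sketch of the tolerance rule this scenario relies on: a DataNode is expected to keep serving as long as the number of failed volumes does not exceed dfs.datanode.failed.volumes.tolerated. The function name and its parameters are hypothetical and only model the rule; this is not Hadoop code.

```python
def datanode_keeps_running(total_volumes: int, failed_volumes: int, tolerated: int) -> bool:
    """Simplified model (an assumption, not Hadoop's implementation):
    the DataNode stays up while failed volumes <= tolerated failures."""
    if total_volumes <= 0:
        raise ValueError("total_volumes must be positive")
    if not 0 <= failed_volumes <= total_volumes:
        raise ValueError("failed_volumes must be between 0 and total_volumes")
    if tolerated < 0:
        raise ValueError("tolerated must be non-negative")
    return failed_volumes <= tolerated


if __name__ == "__main__":
    # Scenario from this report: 12 disks, 3 tolerated failures, 3 disks failed.
    print(datanode_keeps_running(12, 3, 3))
    # One more failed disk would exceed the tolerated count.
    print(datanode_keeps_running(12, 4, 3))
```

Under this model the DataNode in the scenario is expected to stay up after the three disk failures, which is why the exceptions in the log below are unexpected.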
Log Snippet from the Datanode
=============
2011-09-19 12:43:40,314 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block
blk_7065198814142552283_62557. BlockInfo not found in volumeMap.
2011-09-19 12:43:40,314 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block
blk_7066946313092770579_39189. BlockInfo not found in volumeMap.
2011-09-19 12:43:40,314 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block
blk_7070305189404753930_49359. BlockInfo not found in volumeMap.
2011-09-19 12:43:40,327 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
java.io.IOException: Error in deleting blocks.
at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:1820)
at org.apache.hadoop.hdfs.server.datanode.DataNode.processCommand(DataNode.java:1074)
at org.apache.hadoop.hdfs.server.datanode.DataNode.processCommand(DataNode.java:1036)
at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:891)
at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1419)
at java.lang.Thread.run(Thread.java:619)
2011-09-19 12:43:41,304 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeRegistration(xx.xxx.xxx.xxx:xxxx, storageID=xx-xxxxxxxxxxxx-xx.xxx.xxx.xxx-xxxx-xxxxxxxxxxx, infoPort=1006,
ipcPort=8020):DataXceiver
java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner$LogFileHandler.appendLine(DataBlockScanner.java:788)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.updateScanStatusInternal(DataBlockScanner.java:365)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.verifiedByClient(DataBlockScanner.java:308)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:205)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
at java.lang.Thread.run(Thread.java:619)
2011-09-19 12:43:43,313 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block
blk_7071818644980664768_40827. BlockInfo not found in volumeMap.
2011-09-19 12:43:43,313 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block
blk_7073840977856837621_62108. BlockInfo not found in volumeMap.