Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 0.23.0, 2.0.0-alpha
- None
- Reviewed
Description
Cluster setup:
1 NN, three DNs (DN1, DN2, DN3), replication factor 2, "dfs.blockreport.intervalMsec" set to 300, "dfs.datanode.directoryscan.interval" set to 1
Step 1: Write one file "a.txt" with sync (do not close it).
Step 2: Delete the block from one of the datanodes that received a replica, say DN1 (from its rbw directory).
Step 3: Close the file.
Since the replication factor is 2, the block is replicated to the other datanode.
On the NN side, the following command is then issued to the DN from which the block was deleted:
-------------------------------------------------------------------------------------
2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003
2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas.
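The genstamp comparison described in the NameNode log message above can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: `GenstampCheckSketch`, `BlockState`, and `corruptReason` are hypothetical names invented for illustration and are not Hadoop classes or methods. The only rule modelled is the one the log states, namely that a reported replica whose genstamp differs from that of a COMPLETE block in the block map is treated as corrupt.

```java
/** Toy model of the NameNode-side check described in the log; not Hadoop code. */
class GenstampCheckSketch {
    /** Hypothetical stand-in for the state of the block stored in the block map. */
    enum BlockState { UNDER_CONSTRUCTION, COMPLETE }

    /**
     * Returns a reason string if the reported replica should be treated as
     * corrupt, or null if the reported genstamp is acceptable in this model.
     */
    static String corruptReason(BlockState storedState, long storedGenStamp,
                                String reportedState, long reportedGenStamp) {
        if (storedState == BlockState.COMPLETE && reportedGenStamp != storedGenStamp) {
            return "reported " + reportedState + " replica with genstamp " + reportedGenStamp
                + " does not match COMPLETE block's genstamp in block map " + storedGenStamp;
        }
        return null;
    }

    public static void main(String[] args) {
        // Values taken from the log: stale RBW replica (1002) vs. COMPLETE block (1003).
        System.out.println(corruptReason(BlockState.COMPLETE, 1003L, "RBW", 1002L));
        // A replica carrying the current genstamp is not flagged in this model.
        System.out.println(corruptReason(BlockState.COMPLETE, 1003L, "FINALIZED", 1003L));
    }
}
```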
On the datanode from which the block was deleted, the following exception occurred:
2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap.
2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
java.io.IOException: Error in deleting blocks.
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
    at java.lang.Thread.run(Thread.java:619)
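The datanode-side failure above can be modelled in a few lines. The following is a simplified sketch, not the actual `FSDataset.invalidate` code: `InvalidateSketch`, `addReplica`, and `hasReplica` are hypothetical names, and the loop structure (warn per missing block, then throw once at the end) is an assumption inferred from the two WARN messages in the log.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a datanode handling an invalidate command; not Hadoop code. */
class InvalidateSketch {
    // Hypothetical stand-in for the volumeMap: block id -> generation stamp.
    private final Map<Long, Long> volumeMap = new HashMap<>();

    void addReplica(Long blockId, Long genStamp) { volumeMap.put(blockId, genStamp); }

    boolean hasReplica(Long blockId) { return volumeMap.containsKey(blockId); }

    /** Deletes every block it can find; throws if any block was missing. */
    void invalidate(List<Long> blockIds) throws IOException {
        boolean error = false;
        for (Long id : blockIds) {
            if (volumeMap.remove(id) == null) {
                // The replica was already removed from disk (step 2 of the report),
                // so there is nothing to look up for this block.
                System.err.println("Unexpected error trying to delete block blk_" + id
                    + ". BlockInfo not found in volumeMap.");
                error = true;
            }
        }
        if (error) {
            throw new IOException("Error in deleting blocks.");
        }
    }

    public static void main(String[] args) {
        InvalidateSketch dataset = new InvalidateSketch();
        dataset.addReplica(2L, 1003L); // block 1 is deliberately absent
        try {
            dataset.invalidate(Arrays.asList(1L, 2L));
        } catch (IOException e) {
            System.out.println("Command failed: " + e.getMessage());
        }
        System.out.println("Block 2 still present: " + dataset.hasReplica(2L));
    }
}
```

In this model the command fails as a whole even though the blocks that were present are still deleted, which is one way a manually removed replica could surface as the exception shown in the log.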
Attachments
Issue Links
- is related to
  HDFS-3391 TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing (Closed)