Hadoop HDFS / HDFS-3493

Invalidate excess corrupted blocks as long as minimum replication is satisfied


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha, 2.0.5-alpha
    • Fix Version/s: 2.5.0
    • Component/s: namenode
    • Labels: None

    Description

      Replication factor = 3, block report interval = 1 min; start the NN and 3 DNs.

      Step 1: Write a file without closing it and do hflush (DN1, DN2, DN3 have blk_ts1)
      Step 2: Stop DN3
      Step 3: Recovery happens and the timestamp is updated (blk_ts2)
      Step 4: Close the file
      Step 5: blk_ts2 is finalized and available on DN1 and DN2
      Step 6: Now restart DN3 (which still has blk_ts1 in RBW)

      From the NN side, no command is ever issued to DN3 to delete blk_ts1; the NN only marks DN3's replica as corrupt. As a result, replication of blk_ts2 to DN3 never happens.
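      The fix named in the issue title can be sketched as a change to the NN-side invalidation condition: instead of keeping a corrupt replica around until the block reaches its full target replication, invalidate it as soon as the live replicas satisfy minimum replication, which frees the DN to receive a good copy. A minimal model of that decision, assuming illustrative names (this is not the actual BlockManager code):

      ```java
      // Minimal model of the NN-side decision described in this issue.
      // The class and method names are illustrative, not the real
      // org.apache.hadoop.hdfs.server.blockmanagement API.
      public class CorruptReplicaPolicy {
          static final int MIN_REPLICATION = 1; // dfs.namenode.replication.min default

          /**
           * Before the fix, a corrupt replica was effectively only removed once
           * the block had its full target replication elsewhere. With the fix,
           * it is enough that live replicas satisfy minimum replication.
           */
          static boolean shouldInvalidate(int liveReplicas, boolean fixed,
                                          int targetReplication) {
              if (fixed) {
                  return liveReplicas >= MIN_REPLICATION;
              }
              return liveReplicas >= targetReplication;
          }

          public static void main(String[] args) {
              // Scenario from the description: target = 3, DN1 and DN2 hold the
              // finalized genstamp (2 live replicas), DN3 holds the stale one.
              int live = 2, target = 3;
              // Old behavior: the stale replica on DN3 is never invalidated,
              // so DN3 stays occupied and re-replication to it stalls.
              System.out.println("before fix: invalidate="
                      + shouldInvalidate(live, false, target));
              // Fixed behavior: minReplication is satisfied, so the corrupt
              // replica is invalidated and DN3 becomes a re-replication target.
              System.out.println("after fix: invalidate="
                      + shouldInvalidate(live, true, target));
          }
      }
      ```

      With this condition, the stuck state in the description (2 live replicas, 1 stale replica, target 3) resolves itself: the stale replica on DN3 is deleted and the block can be re-replicated back to 3.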

      NN logs:
      ========

      INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_3927215081484173742 to add as corrupt on XX.XX.XX.XX:50276 by /XX.XX.XX.XX because reported RWR replica with genstamp 1007 does not match COMPLETE block's genstamp in block map 1008
      INFO org.apache.hadoop.hdfs.StateChange: BLOCK* processReport: from DatanodeRegistration(XX.XX.XX.XX, storageID=DS-443871816-XX.XX.XX.XX-50276-1336829714197, infoPort=50275, ipcPort=50277, storageInfo=lv=-40;cid=CID-e654ac13-92dc-4f82-a22b-c0b6861d06d7;nsid=2063001898;c=0), blocks: 2, processing time: 1 msecs
      INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_3927215081484173742_1008 from neededReplications as it has enough replicas.
      
      INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_3927215081484173742 to add as corrupt on XX.XX.XX.XX:50276 by /XX.XX.XX.XX because reported RWR replica with genstamp 1007 does not match COMPLETE block's genstamp in block map 1008
      INFO org.apache.hadoop.hdfs.StateChange: BLOCK* processReport: from DatanodeRegistration(XX.XX.XX.XX, storageID=DS-443871816-XX.XX.XX.XX-50276-1336829714197, infoPort=50275, ipcPort=50277, storageInfo=lv=-40;cid=CID-e654ac13-92dc-4f82-a22b-c0b6861d06d7;nsid=2063001898;c=0), blocks: 2, processing time: 1 msecs
      WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Not able to place enough replicas, still in need of 1 to reach 1
      For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
      

      fsck Report
      ===========

      /file21:  Under replicated BP-1008469586-XX.XX.XX.XX-1336829603103:blk_3927215081484173742_1008. Target Replicas is 3 but found 2 replica(s).
      .Status: HEALTHY
       Total size:	495 B
       Total dirs:	1
       Total files:	3
       Total blocks (validated):	3 (avg. block size 165 B)
       Minimally replicated blocks:	3 (100.0 %)
       Over-replicated blocks:	0 (0.0 %)
       Under-replicated blocks:	1 (33.333332 %)
       Mis-replicated blocks:		0 (0.0 %)
       Default replication factor:	1
       Average block replication:	2.0
       Corrupt blocks:		0
       Missing replicas:		1 (14.285714 %)
       Number of data-nodes:		3
       Number of racks:		1
      FSCK ended at Sun May 13 09:49:05 IST 2012 in 9 milliseconds
      The filesystem under path '/' is HEALTHY
      

      Attachments

        1. HDFS-3493.002.patch
          6 kB
          Juan Yu
        2. HDFS-3493.003.patch
          9 kB
          Juan Yu
        3. HDFS-3493.004.patch
          8 kB
          Juan Yu
        4. HDFS-3493.patch
          5 kB
          Vinayakumar B

      Issue Links

      Activity

      People

        Assignee: Juan Yu (jyu@cloudera.com)
        Reporter: J.Andreina (andreina)
        Votes: 0
        Watchers: 14

      Dates

        Created:
        Updated:
        Resolved: