[HADOOP-4643] NameNode should exclude excessive replicas when counting live replicas for a block - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.18.3
Component/s: None
Labels: None
Hadoop Flags: Reviewed

Description

Currently the NameNode keeps excessive replicas in blocksMap and counts them as live replicas. Although excessive replicas have been marked as invalid, scheduling their deletion may be delayed, and the datanode does not send a deletion confirmation until the next block report. As a result, excessive replicas may stay in blocksMap for quite a while, which may leave under-replicated blocks undetected by the NameNode.

For example, assume that block b is on datanodes d1, d2, and d3. Consider the following scenario:
1. d1 loses its heartbeat. The NN replicates b to another datanode, say d4.
2. d1 comes back. The NN finds that b is over-replicated, so it chooses one replica, say the one on d4, as an excessive replica and marks it as invalid. Now b has 3 valid replicas (d1, d2, d3) and 1 excessive (invalid) replica (d4), all in blocksMap.
3. d2 loses its heartbeat and gets removed from blocksMap. Block b now has 2 valid replicas (d1 and d3) and 1 excessive invalid replica (d4). Block b is under-replicated, but the NN still counts 3 live replicas for it and so takes no action to replicate it.
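
The scenario above can be replayed in a small, self-contained Java sketch. This is not the code from the attached patches: the class ReplicaCountSketch, its two sets, and all method names are hypothetical stand-ins for the NameNode's per-block bookkeeping, and a target replication factor of 3 is assumed. It only contrasts a count that includes excessive replicas with one that excludes them.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the replica bookkeeping for a single block b.
class ReplicaCountSketch {
    // Datanodes currently recorded as holding b (stand-in for the blocksMap entry).
    private final Set<String> holders = new HashSet<>();
    // Subset of holders whose replica was marked excessive and still awaits deletion.
    private final Set<String> excess = new HashSet<>();

    void addReplica(String dn) { holders.add(dn); }

    void markExcess(String dn) {
        if (holders.contains(dn)) {
            excess.add(dn);
        }
    }

    // A datanode that loses its heartbeat is dropped from both sets.
    void removeDatanode(String dn) {
        holders.remove(dn);
        excess.remove(dn);
    }

    // Count described in the report: every replica in the map is treated as live.
    int countAllReplicas() { return holders.size(); }

    // Proposed count: replicas already marked excessive are not live.
    int countLiveReplicas() {
        int live = 0;
        for (String dn : holders) {
            if (!excess.contains(dn)) {
                live++;
            }
        }
        return live;
    }

    // Replays steps 1 to 3 of the scenario.
    static ReplicaCountSketch replayScenario() {
        ReplicaCountSketch b = new ReplicaCountSketch();
        b.addReplica("d1");
        b.addReplica("d2");
        b.addReplica("d3");
        b.removeDatanode("d1"); // step 1: d1 loses its heartbeat
        b.addReplica("d4");     //         b is re-replicated to d4
        b.addReplica("d1");     // step 2: d1 comes back
        b.markExcess("d4");     //         d4 is chosen as the excessive replica
        b.removeDatanode("d2"); // step 3: d2 loses its heartbeat
        return b;
    }

    public static void main(String[] args) {
        int targetReplication = 3; // assumed replication factor
        ReplicaCountSketch b = replayScenario();
        System.out.println("count including excessive: " + b.countAllReplicas());
        System.out.println("count excluding excessive: " + b.countLiveReplicas());
        System.out.println("under-replicated: " + (b.countLiveReplicas() < targetReplication));
    }
}
```

With the excessive replica included, the count stays at the assumed target of 3 and nothing is scheduled; with it excluded, the count drops to 2 and the block can be recognized as under-replicated.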

Attachments

nodeCount.patch | 14/Nov/08 00:42 | 13 kB | Hairong Kuang
nodeCount-br18.patch | 14/Nov/08 00:04 | 13 kB | Hairong Kuang

Issue Links

is part of: HADOOP-4556 Block went missing (Closed)

is related to: HDFS-37 An invalidated block should be removed from the blockMap (Resolved)

People

Assignee: Hairong Kuang

Reporter: Hairong Kuang

Votes: 0

Watchers: 1

Dates

Created: 12/Nov/08 19:56

Updated: 08/Jul/09 16:43

Resolved: 14/Nov/08 01:04