Hadoop HDFS / HDFS-1480

All replicas of a block can end up on the same rack when some datanodes are decommissioning.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.2
    • Fix Version/s: 0.23.0
    • Component/s: namenode
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      It appears that all replicas of a block can end up in the same rack. The likelihood of such mis-replicated blocks seems to be directly related to decommissioning of nodes.

      After a rolling OS upgrade of a running cluster (decommission 3-10% of the nodes, re-install, add them back), all replicas of about 0.16% of blocks ended up in the same rack.

      The Hadoop NameNode UI etc. does not seem to know about such incorrectly replicated blocks; "hadoop fsck .." does report that the blocks must be replicated on additional racks.

      Looking at ReplicationTargetChooser.java, the following seem suspect:

      snippet-01:

          int maxNodesPerRack =
            (totalNumOfReplicas-1)/clusterMap.getNumOfRacks()+2;
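
      To see why snippet-01 looks suspect, it helps to plug in concrete numbers: with integer division, (totalNumOfReplicas-1)/numOfRacks + 2 evaluates to 2 whenever the replication factor is small relative to the rack count, so for a block with repl=2 the per-rack cap by itself does not rule out placing both replicas on one rack. The standalone sketch below (plain Java, no Hadoop dependencies; the rack counts are made-up illustrative values) simply evaluates the quoted expression.

          // Standalone evaluation of the maxNodesPerRack expression from snippet-01.
          // The rack counts below are arbitrary, for illustration only.
          public class MaxNodesPerRackDemo {
            static int maxNodesPerRack(int totalNumOfReplicas, int numOfRacks) {
              // Same integer arithmetic as the quoted line in ReplicationTargetChooser
              return (totalNumOfReplicas - 1) / numOfRacks + 2;
            }

            public static void main(String[] args) {
              // repl=2 on a 40-rack cluster: (2-1)/40 + 2 = 2,
              // i.e. both replicas may legally share a single rack.
              System.out.println(maxNodesPerRack(2, 40)); // prints 2
              // repl=3 on a 40-rack cluster: (3-1)/40 + 2 = 2 as well.
              System.out.println(maxNodesPerRack(3, 40)); // prints 2
            }
          }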
      

      snippet-02:

            case 2:
              if (clusterMap.isOnSameRack(results.get(0), results.get(1))) {
                chooseRemoteRack(1, results.get(0), excludedNodes,
                                 blocksize, maxNodesPerRack, results);
              } else if (newBlock){
                chooseLocalRack(results.get(1), excludedNodes, blocksize,
                                maxNodesPerRack, results);
              } else {
                chooseLocalRack(writer, excludedNodes, blocksize,
                                maxNodesPerRack, results);
              }
              if (--numOfReplicas == 0) {
                break;
              }
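
      Read on its own, the case-2 branch above picks the rack for the next target as follows; the tiny model below restates the quoted conditionals with racks as plain strings (the rack names and the helper method are purely illustrative, not Hadoop code). It makes it easier to see that on the non-newBlock path, i.e. during re-replication, the choice targets the writer's local rack.

          // Illustrative restatement of the case-2 branch from snippet-02.
          // Racks are plain strings; nothing here calls into Hadoop.
          public class CaseTwoDemo {
            /** Rack targeted for the next replica, mirroring the quoted branch. */
            static String nextTargetRack(String rack0, String rack1,
                                         String writerRack, boolean newBlock) {
              if (rack0.equals(rack1)) {
                return "any rack other than " + rack0;  // chooseRemoteRack(...)
              } else if (newBlock) {
                return rack1;                           // chooseLocalRack(results.get(1), ...)
              } else {
                return writerRack;                      // chooseLocalRack(writer, ...)
              }
            }

            public static void main(String[] args) {
              // Re-replication (not a new block): the next target goes to the writer's rack.
              System.out.println(nextTargetRack("/rackA", "/rackB", "/rackA", false)); // /rackA
            }
          }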
      

      snippet-03:

          do {
            DatanodeDescriptor[] selectedNodes =
              chooseRandom(1, nodes, excludedNodes);
            if (selectedNodes.length == 0) {
              throw new NotEnoughReplicasException(
                                                   "Not able to place enough replicas");
            }
            result = (DatanodeDescriptor)(selectedNodes[0]);
          } while(!isGoodTarget(result, blocksize, maxNodesPerRack, results));
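
      The retry loop in snippet-03 keeps drawing random candidates until isGoodTarget accepts one. The sketch below is a deliberately reduced stand-in (the rack-list model and the stripped-down isGoodTarget are assumptions for illustration; the real method performs several additional checks): it keeps only the per-rack cap, and shows that with maxNodesPerRack = 2 a candidate on an already-used rack is still accepted, so both replicas of a repl=2 block can legally land on one rack.

          import java.util.ArrayList;
          import java.util.Collections;
          import java.util.List;

          // Reduced stand-in for the snippet-03 retry loop: only the per-rack
          // cap is modeled; node choice and all other checks are omitted.
          public class RetryLoopDemo {
            // Accept a candidate unless its rack already holds maxNodesPerRack
            // chosen targets (illustrative simplification of isGoodTarget).
            static boolean isGoodTarget(String candidateRack, int maxNodesPerRack,
                                        List<String> chosenRacks) {
              return Collections.frequency(chosenRacks, candidateRack) < maxNodesPerRack;
            }

            public static void main(String[] args) {
              int maxNodesPerRack = 2;                   // value snippet-01 yields for repl=2
              List<String> chosenRacks = new ArrayList<>();
              chosenRacks.add("/rackA");                 // first replica already on /rackA

              // A randomly drawn candidate on the same rack still passes the check:
              System.out.println(isGoodTarget("/rackA", maxNodesPerRack, chosenRacks)); // true
            }
          }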
      
      Attachments

      1. hdfs-1480-test.txt (3 kB, Todd Lipcon)
      2. hdfs-1480.txt (22 kB, Todd Lipcon)
      3. hdfs-1480.txt (23 kB, Todd Lipcon)
      4. hdfs-1480.txt (22 kB, Todd Lipcon)

        Issue Links

          • This issue relates to HDFS-15

          Activity

          Todd Lipcon made changes -
            Status: Patch Available [ 10002 ] → Resolved [ 5 ]
            Hadoop Flags: [Reviewed]
            Resolution: Fixed [ 1 ]
          Todd Lipcon made changes -
            Summary: All replicas for a block with repl=2 end up in same rack → All replicas of a block can end up on the same rack when some datanodes are decommissioning.
          Todd Lipcon made changes -
            Attachment: hdfs-1480.txt [ 12491018 ]
          Todd Lipcon made changes -
            Status: Open [ 1 ] → Patch Available [ 10002 ]
            Fix Version/s: 0.23.0 [ 12315571 ]
          Todd Lipcon made changes -
            Attachment: hdfs-1480.txt [ 12490729 ]
          Todd Lipcon made changes -
            Attachment: hdfs-1480.txt [ 12490492 ]
          Todd Lipcon made changes -
            Assignee: Todd Lipcon [ tlipcon ]
          Todd Lipcon made changes -
            Attachment: hdfs-1480-test.txt [ 12483658 ]
          T Meyarivan made changes -
            Summary: All replicas for a block end up in same rack → All replicas for a block with repl=2 end up in same rack
            Description: edited
          Tsz Wo Nicholas Sze made changes -
            Link: This issue relates to HDFS-15 [ HDFS-15 ]
          Tsz Wo Nicholas Sze made changes -
            Affects Version/s: 0.20.2 [ 12314204 ]
            Priority: Minor [ 4 ] → Major [ 3 ]
            Description: edited
            Component/s: name-node [ 12312926 ]
          T Meyarivan created issue -

            People

            • Assignee: Todd Lipcon
            • Reporter: T Meyarivan
            • Votes: 0
            • Watchers: 8
