Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-4898

BlockPlacementPolicyWithNodeGroup.chooseRemoteRack() fails to properly fallback to local rack

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • 1.2.0, 2.0.4-alpha
    • 1.3.0, 2.1.1-beta
    • namenode
    • None
    • Reviewed

    Description

      As currently implemented, BlockPlacementPolicyWithNodeGroup does not properly fallback to local rack when no nodes are available in remote racks, resulting in an improper NotEnoughReplicasException.

      BlockPlacementPolicyWithNodeGroup.java
        @Override
        protected void chooseRemoteRack(int numOfReplicas,
            DatanodeDescriptor localMachine, HashMap<Node, Node> excludedNodes,
            long blocksize, int maxReplicasPerRack, List<DatanodeDescriptor> results,
            boolean avoidStaleNodes) throws NotEnoughReplicasException {
          int oldNumOfReplicas = results.size();
          // randomly choose one node from remote racks
          try {
            chooseRandom(
                numOfReplicas,
                "~" + NetworkTopology.getFirstHalf(localMachine.getNetworkLocation()),
                excludedNodes, blocksize, maxReplicasPerRack, results,
                avoidStaleNodes);
          } catch (NotEnoughReplicasException e) {
            chooseRandom(numOfReplicas - (results.size() - oldNumOfReplicas),
                localMachine.getNetworkLocation(), excludedNodes, blocksize,
                maxReplicasPerRack, results, avoidStaleNodes);
          }
        }
      

      As currently coded the chooseRandom() call in the catch block will never succeed as the set of nodes within the passed in node path (e.g. /rack1/nodegroup1) is entirely contained within the set of excluded nodes (both are the set of nodes within the same nodegroup as the node chosen first replica).

      The bug is that the fallback chooseRandom() call in the catch block should be passing in the complement of the node path used in the initial chooseRandom() call in the try block (e.g. /rack1) - namely:

      NetworkTopology.getFirstHalf(localMachine.getNetworkLocation())
      

      This will yield the proper fallback behavior of choosing a random node from within the same rack, but still excluding those nodes in the same nodegroup

      Attachments

        1. h4898_20130809.patch
          2 kB
          Tsz-wo Sze
        2. h4898_20130809_b-1.patch
          1 kB
          Tsz-wo Sze

        Issue Links

          Activity

            People

              szetszwo Tsz-wo Sze
              sirianni Eric Sirianni
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: