Details
-
New Feature
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
-
Reviewed
Description
Since the Balancer is a Hadoop Tool, it was updated to be directly aware of the four-layer hierarchy instead of creating an alternative Balancer implementation. To accommodate extensibility, a new protected method, doChooseNodesForCustomFaultDomain, is now called from the existing chooseNodes method so that a subclass of the Balancer can customize the balancer algorithm for other failure and locality topologies. An alternative option is to encapsulate the algorithm used for the four-layer hierarchy into a collaborating strategy class.
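The extension point described above follows the template-method pattern. The sketch below only illustrates that pattern: the class names, the return type, and the position of the hook inside chooseNodes are assumptions made for the example, not the actual Balancer signatures from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Balancer; NOT the real Hadoop class.
class SketchBalancer {
    // Records which pairing passes ran, so the effect of the hook is visible.
    protected final List<String> passes = new ArrayList<>();

    // Fixed outline of the pairing procedure. Calling the hook before the
    // rack-level passes is an assumption made for this sketch.
    final List<String> chooseNodes() {
        doChooseNodesForCustomFaultDomain();
        passes.add("same-rack");
        passes.add("off-rack");
        return passes;
    }

    // Default: no additional fault-domain layer, so the hook does nothing.
    protected void doChooseNodesForCustomFaultDomain() {
    }
}

// Hypothetical subclass that adds a node-group pass through the hook.
class NodeGroupSketchBalancer extends SketchBalancer {
    @Override
    protected void doChooseNodesForCustomFaultDomain() {
        passes.add("same-node-group");
    }
}
```

With this shape, a subclass changes only the extra pass and inherits the rest of the procedure unchanged. The strategy-class alternative mentioned above would instead pass the pairing logic in as a separate object.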
The key change introduced to support a four-layer hierarchy was to override the algorithm for choosing <source, target> pairs for balancing. Unit tests were created to test the new algorithm.
The algorithm now gives first priority to choosing a source and target node in the same node group. The overall balancing policy is therefore: balance between nodes within the same node group first, then within the same rack, and finally off rack. In addition, the Balancer must check that no two replicas of a block end up in the same node group after balancing.
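The priority order and the replica check can be sketched as follows. This is a simplified illustration, not the Balancer code: Node, PairingSketch, choosePairs, and isGoodTarget are hypothetical names, and each source is paired with at most one target, whereas the real Balancer also accounts for how many bytes each node needs to move.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical location of a datanode in a /rack/nodegroup/node topology.
final class Node {
    final String rack;
    final String nodeGroup;
    final String name;

    Node(String rack, String nodeGroup, String name) {
        this.rack = rack;
        this.nodeGroup = nodeGroup;
        this.name = name;
    }

    // Node groups are only comparable within the same rack.
    boolean sameNodeGroup(Node other) {
        return rack.equals(other.rack) && nodeGroup.equals(other.nodeGroup);
    }
}

final class PairingSketch {
    // Declaration order is the priority order: closest locality first.
    enum Locality { SAME_NODE_GROUP, SAME_RACK, OFF_RACK }

    static boolean matches(Node s, Node t, Locality level) {
        boolean sameRack = s.rack.equals(t.rack);
        boolean sameGroup = s.sameNodeGroup(t);
        switch (level) {
            case SAME_NODE_GROUP: return sameGroup;
            case SAME_RACK:       return sameRack && !sameGroup;
            default:              return !sameRack;
        }
    }

    // One pass per locality level; nodes paired in an earlier pass are
    // removed so later passes only see what is still unmatched.
    static List<String> choosePairs(List<Node> sources, List<Node> targets) {
        List<Node> remainingSources = new ArrayList<>(sources);
        List<Node> remainingTargets = new ArrayList<>(targets);
        List<String> pairs = new ArrayList<>();
        for (Locality level : Locality.values()) {
            Iterator<Node> si = remainingSources.iterator();
            while (si.hasNext()) {
                Node s = si.next();
                Iterator<Node> ti = remainingTargets.iterator();
                while (ti.hasNext()) {
                    Node t = ti.next();
                    if (matches(s, t, level)) {
                        pairs.add(s.name + "->" + t.name);
                        si.remove();
                        ti.remove();
                        break;
                    }
                }
            }
        }
        return pairs;
    }

    // A move is rejected if another replica of the block (other than the
    // one being moved from source) already lives in the target's node group.
    static boolean isGoodTarget(Node source, Node target, List<Node> replicaHolders) {
        for (Node holder : replicaHolders) {
            if (holder != source && holder.sameNodeGroup(target)) {
                return false;
            }
        }
        return true;
    }
}
```

Running the three passes in sequence, rather than scoring every candidate pair, keeps the priority explicit: an off-rack pairing is only considered for nodes that found no partner in their node group or rack.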
Attachments
Issue Links
- is duplicated by
-
HDFS-3496 Update balancer for correctness with NodeGroup - two replicas are not on the same node
- Resolved
-
HDFS-3497 Update Balancer policy with NodeGroup layer
- Resolved
- is part of
-
HADOOP-8468 Umbrella of enhancements to support different failure and locality topologies
- Resolved
- is related to
-
HDFS-3942 Backport HDFS-3495: Update balancer policy for Network Topology with additional 'NodeGroup' layer
- Closed
-
HDFS-4234 Use the generic code for choosing datanode in Balancer
- Closed
- relates to
-
HDFS-3619 isGoodBlockCandidate() in Balancer is not handling properly if replica factor >3
- Resolved