Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Labels: None
- Hadoop Flags: Reviewed
Description
The most important job of the HDFS balancer is usually to bring down the most heavily used datanodes, so that no datanode's usage gets too high.
Currently, the balancer picks source nodes almost at random, regardless of their usage. This makes it slow to bring down the top used datanodes when the cluster has few underutilized nodes (consider a cluster expansion).
We can add an option to prefer the top used nodes first in each iteration, as suggested in HDFS-14894.
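To illustrate the proposal, here is a minimal sketch of "top used first" source selection. It is not the balancer's actual code: the `Node` class, the `pickSources` method, and the `maxSources` per-iteration limit are all hypothetical names introduced for this example, and a real implementation would work with the balancer's own datanode reports.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class TopUsedFirstSketch {
    /** Hypothetical stand-in for a datanode and its disk utilization. */
    static final class Node {
        final String name;
        final double utilizationPercent;

        Node(String name, double utilizationPercent) {
            this.name = name;
            this.utilizationPercent = utilizationPercent;
        }
    }

    /**
     * Returns up to maxSources candidates, most utilized first.
     * The per-iteration limit is an assumption made for this sketch.
     */
    static List<Node> pickSources(List<Node> candidates, int maxSources) {
        // Copy first so the caller's list is left untouched.
        List<Node> sorted = new ArrayList<>(candidates);
        // Descending order by utilization: the fullest node comes first.
        sorted.sort(Comparator
                .comparingDouble((Node n) -> n.utilizationPercent)
                .reversed());
        int count = Math.min(Math.max(maxSources, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, count));
    }
}
```

With candidates at 62%, 91.5%, and 78% utilization and a limit of two, this selects the 91.5% node and then the 78% node, whereas a random pick could spend the iteration on the 62% node.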
Issue Links
- causes: HDFS-15904 Flaky test TestBalancer#testBalancerWithSortTopNodes() (Resolved)
- links to