Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 3.0.0-alpha-1
- None
Description
At the beginning of a run, the balancer logs the following at INFO level:
balancer.StochasticLoadBalancer: start StochasticLoadBalancer.balancer, initCost=277.3479243125063, functionCost=RegionCountSkewCostFunction : (500.0, 0.3749771215224234); ServerLocalityCostFunction : (25.0, 0.5807483226644186); RackLocalityCostFunction : (15.0, 0.0); TableSkewCostFunction : (1000.0, 0.0019704142954972883); StoreFileCostFunction : (200.0, 0.3668512059459341); computedMaxSteps: 42270438200
The cost is reported without context, so it is hard for an operator to understand how unbalanced the balancer considers the cluster to be and how much progress is being made.
For a large cluster, the calculation can take a long time. We also need to let the operator know that the calculation may run for up to the maximum allowed time before it completes.
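One possible way to give the cost some context, sketched here as an assumption rather than as the actual fix: each entry in the log line above is a (multiplier, cost) pair, and summing multiplier * cost over the five functions reproduces the logged initCost (about 277.35). Dividing that total by the sum of the multipliers (1740 in this log line) gives a value between 0 and 1 that is easier to interpret. The class and method names below are hypothetical and are not part of StochasticLoadBalancer.

```java
// Illustrative sketch only: names are hypothetical, not HBase APIs.
public class BalancerCostSketch {

    /** Sum of multiplier * cost over all cost functions. */
    static double weightedTotal(double[] multipliers, double[] costs) {
        double total = 0.0;
        for (int i = 0; i < multipliers.length; i++) {
            total += multipliers[i] * costs[i];
        }
        return total;
    }

    /** Weighted total divided by the sum of multipliers; 0 if all multipliers are 0. */
    static double normalizedCost(double[] multipliers, double[] costs) {
        double multiplierSum = 0.0;
        for (double m : multipliers) {
            multiplierSum += m;
        }
        return multiplierSum == 0.0 ? 0.0 : weightedTotal(multipliers, costs) / multiplierSum;
    }

    public static void main(String[] args) {
        // Values copied from the start-of-run log line in this issue, in order:
        // RegionCountSkew, ServerLocality, RackLocality, TableSkew, StoreFile.
        double[] multipliers = {500.0, 25.0, 15.0, 1000.0, 200.0};
        double[] costs = {
            0.3749771215224234,
            0.5807483226644186,
            0.0,
            0.0019704142954972883,
            0.3668512059459341
        };
        System.out.printf("weighted total  = %.4f%n", weightedTotal(multipliers, costs));
        System.out.printf("normalized cost = %.4f%n", normalizedCost(multipliers, costs));
    }
}
```

With the values from this log line, the normalized cost comes out at roughly 0.16, which an operator can read as "about 16% of the worst possible weighted imbalance" under this assumed normalization.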
At the end of the computation, the balancer logs:
balancer.StochasticLoadBalancer: Finished computing new load balance plan. Computation took PT40M0.006S to try 1036409 different iterations. Found a solution that moves 161926 regions; Going from a computed cost of 118.75715593924485 to a new cost of 1.5509126920967042
The time taken to compute the plan is also printed in a format that is not human readable (PT40M0.006S). We also need to let the operator know that the balancer is only submitting the plan, and that it is up to the execution step to complete the region moves.
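The PT40M0.006S value in the log line above is the ISO-8601 form produced by java.time.Duration.toString(). A minimal sketch of a more readable rendering follows; the class name, method name, and output format are assumptions for illustration, not the format the balancer actually adopted.

```java
import java.time.Duration;
import java.util.Locale;

// Illustrative sketch only: the helper name and output format are hypothetical.
public class DurationFormatSketch {

    /** Renders a duration as "<hours>h <minutes>m <seconds>.<millis>s". */
    static String humanReadable(Duration d) {
        long totalMillis = d.toMillis();
        long hours = totalMillis / 3_600_000L;
        long minutes = (totalMillis / 60_000L) % 60;
        long seconds = (totalMillis / 1_000L) % 60;
        long millis = totalMillis % 1_000L;
        // Locale.ROOT keeps the digits and layout stable across JVM default locales.
        return String.format(Locale.ROOT, "%dh %dm %d.%03ds", hours, minutes, seconds, millis);
    }

    public static void main(String[] args) {
        // The duration string from the end-of-run log line in this issue.
        Duration took = Duration.parse("PT40M0.006S");
        System.out.println(humanReadable(took)); // prints "0h 40m 0.006s"
    }
}
```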
Issue Links
- is related to HBASE-24528 Improve balancer decision observability (Resolved)