  1. HBase
  2. HBASE-17565

StochasticLoadBalancer may incorrectly skip balancing due to skewed multiplier sum


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0, 1.3.3, 2.0.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      I was investigating why a 6 node cluster kept skipping balancing requests.

      Here were the region counts on the servers:
      449, 448, 447, 449, 453, 0

      2017-01-26 22:04:47,145 INFO  [RpcServer.deafult.FPBQ.Fifo.handler=1,queue=0,port=16000] balancer.StochasticLoadBalancer: Skipping load balancing because balanced cluster; total cost is 127.0171157050385, sum multiplier is 111087.0 min cost which need balance is 0.05
      

      The large multiplier sum caught my eye. Here is what additional debug logging showed:

      2017-01-27 23:25:31,749 DEBUG [RpcServer.deafult.FPBQ.Fifo.handler=9,queue=0,port=16000] balancer.StochasticLoadBalancer: class org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer$RegionReplicaHostCostFunction with multiplier 100000.0
      2017-01-27 23:25:31,749 DEBUG [RpcServer.deafult.FPBQ.Fifo.handler=9,queue=0,port=16000] balancer.StochasticLoadBalancer: class org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer$RegionReplicaRackCostFunction with multiplier 10000.0
      

      Note, however, that no table in the cluster used read replicas, so these two cost functions inflated the multiplier sum without contributing any cost.
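
      The arithmetic behind the skipped request can be sketched as follows. This is an illustrative Python sketch, not the HBase code: it assumes the balancer compares total cost divided by the multiplier sum against the minimum cost threshold, as the log message suggests, and that the two replica cost functions contributed zero cost. The function name `needs_balance` is hypothetical; the numbers come from the log lines above.

      ```python
      def needs_balance(total_cost, sum_multiplier, min_cost_need_balance):
          """Return True when the normalized cost reaches the threshold.

          Assumed decision rule (inferred from the log message): balancing is
          skipped when total_cost / sum_multiplier is below the threshold.
          """
          if total_cost <= 0 or sum_multiplier <= 0:
              return False
          return total_cost / sum_multiplier >= min_cost_need_balance


      # Values taken from the INFO and DEBUG log lines in this report.
      TOTAL_COST = 127.0171157050385
      SUM_MULTIPLIER = 111087.0
      MIN_COST = 0.05
      REPLICA_MULTIPLIERS = 100000.0 + 10000.0  # host + rack replica functions

      # With the replica multipliers included, the normalized cost is about
      # 0.0011, which is below 0.05, so balancing is skipped.
      skewed = needs_balance(TOTAL_COST, SUM_MULTIPLIER, MIN_COST)

      # With them excluded, the multiplier sum drops to 1087.0 and the
      # normalized cost is about 0.117, which is above 0.05.
      adjusted = needs_balance(
          TOTAL_COST, SUM_MULTIPLIER - REPLICA_MULTIPLIERS, MIN_COST
      )

      print(skewed, adjusted)
      ```

      Under these assumptions, the two unused replica multipliers alone are enough to push the normalized cost below the threshold, even though one server holds no regions.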

      I can think of two ways of fixing this situation:

      1. If there is no read replica in the cluster, ignore the multipliers for the above two functions.
      2. When the cost() returned by a CostFunction is 0 (or very close to 0.0), ignore that function's multiplier.
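
      A minimal sketch of the second option, in Python for illustration only. It does not reproduce the attached patches: the function name `weighted_totals`, the tolerance `EPSILON`, and the third (multiplier, cost) pair are all hypothetical. Only the two replica multipliers come from the debug log.

      ```python
      EPSILON = 1e-9  # hypothetical tolerance for "very close to 0.0"


      def weighted_totals(cost_functions, epsilon=EPSILON):
          """Sum weighted costs and multipliers, skipping zero-cost functions.

          cost_functions is an iterable of (multiplier, cost) pairs. A function
          whose cost is at or below epsilon adds nothing to either total, so
          its multiplier cannot dilute the normalized cost.
          """
          total_cost = 0.0
          sum_multiplier = 0.0
          for multiplier, cost in cost_functions:
              if multiplier <= 0 or cost <= epsilon:
                  continue  # ignore this function's multiplier entirely
              total_cost += multiplier * cost
              sum_multiplier += multiplier
          return total_cost, sum_multiplier


      functions = [
          (100000.0, 0.0),  # RegionReplicaHostCostFunction, no replicas in use
          (10000.0, 0.0),   # RegionReplicaRackCostFunction, no replicas in use
          (500.0, 0.25),    # hypothetical function with a non-zero cost
      ]

      total, multiplier_sum = weighted_totals(functions)
      print(total, multiplier_sum)
      ```

      With this rule the multiplier sum reflects only the functions that actually report a cost. The first option reaches the same result for this cluster by checking for read replicas up front rather than inspecting each cost value.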

      Attachments

        1. 17565.v1.txt
          1 kB
          Ted Yu
        2. 17565.v2.txt
          2 kB
          Ted Yu
        3. 17565.v3.txt
          3 kB
          Ted Yu
        4. 17565.v4.txt
          2 kB
          Ted Yu
        5. 17565.v5.txt
          2 kB
          Ted Yu
        6. 17565.v6.txt
          5 kB
          Ted Yu
        7. 17565.addendum
          1 kB
          Ted Yu


          People

            yuzhihong@gmail.com Ted Yu
            yuzhihong@gmail.com Ted Yu
            Votes: 0
            Watchers: 5
