HBase / HBASE-17110

Improve SimpleLoadBalancer to always take server-level balance into account

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.4, 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: Balancer
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: After HBASE-17110, the byTable strategy of SimpleLoadBalancer also takes server-level balance into account.

    Description

      Currently, the byTable strategy can still leave the cluster imbalanced at the server level. This JIRA improves that.

      Some more background:
      When operating large-scale clusters (our case), some companies still prefer SimpleLoadBalancer for its simplicity, quick balance plan generation, etc. The current SimpleLoadBalancer has two modes:
      1. byTable, which only guarantees that the regions of each table are uniformly distributed.
      2. byCluster, which ignores the distribution within tables and balances all regions together.
      If the pressure on different tables differs, the byTable option is preferable in most cases. Yet this choice sacrifices cluster-level balance and can leave some servers with significantly higher load, e.g. 242 regions on server A but 417 regions on server B (real-world stats).
      Consider a cluster with 3 tables and 4 servers:

        server A has 3 regions: table1:1, table2:1, table3:1
        server B has 3 regions: table1:2, table2:2, table3:2
        server C has 3 regions: table1:3, table2:3, table3:3
        server D has 0 regions.
      

      From the byTable strategy's perspective, the cluster is already perfectly balanced at the table level. But an ideal state would look like:

        server A has 2 regions: table2:1, table3:1
        server B has 2 regions: table1:2, table3:2
        server C has 3 regions: table1:3, table2:3, table3:3
        server D has 2 regions: table1:1, table2:2
      

      The server loads change from 3,3,3,0 to 2,2,3,2, while table1, table2 and table3 each remain balanced. This is the goal this JIRA tries to achieve.
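
      The example above can be checked with a small standalone sketch. It is not the patch's code and does not use any HBase classes; the class name `BalanceExample` and its helper methods are hypothetical. It encodes the two assignments from the description and reports the per-server load and, for each table, the difference between the most and least regions any server holds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BalanceExample {

    /** Assignment before rebalancing: server -> "table:region" entries. */
    public static Map<String, List<String>> before() {
        Map<String, List<String>> m = new LinkedHashMap<>();
        m.put("A", Arrays.asList("table1:1", "table2:1", "table3:1"));
        m.put("B", Arrays.asList("table1:2", "table2:2", "table3:2"));
        m.put("C", Arrays.asList("table1:3", "table2:3", "table3:3"));
        m.put("D", new ArrayList<>());
        return m;
    }

    /** Desired assignment: table1:1 and table2:2 have moved to server D. */
    public static Map<String, List<String>> after() {
        Map<String, List<String>> m = new LinkedHashMap<>();
        m.put("A", Arrays.asList("table2:1", "table3:1"));
        m.put("B", Arrays.asList("table1:2", "table3:2"));
        m.put("C", Arrays.asList("table1:3", "table2:3", "table3:3"));
        m.put("D", Arrays.asList("table1:1", "table2:2"));
        return m;
    }

    /** Total region count per server, in insertion order (A, B, C, D). */
    public static int[] serverLoads(Map<String, List<String>> assignment) {
        return assignment.values().stream().mapToInt(List::size).toArray();
    }

    /** Max minus min number of regions of one table held by any server. */
    public static int tableSpread(Map<String, List<String>> assignment, String table) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (List<String> regions : assignment.values()) {
            int count = 0;
            for (String region : regions) {
                if (region.startsWith(table + ":")) {
                    count++;
                }
            }
            min = Math.min(min, count);
            max = Math.max(max, count);
        }
        return max - min;
    }

    /** Max minus min total region count across servers. */
    public static int serverSpread(Map<String, List<String>> assignment) {
        int[] loads = serverLoads(assignment);
        return Arrays.stream(loads).max().getAsInt() - Arrays.stream(loads).min().getAsInt();
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<String>>> states = new LinkedHashMap<>();
        states.put("before", before());
        states.put("after", after());
        for (Map.Entry<String, Map<String, List<String>>> e : states.entrySet()) {
            Map<String, List<String>> a = e.getValue();
            System.out.println(e.getKey() + " loads=" + Arrays.toString(serverLoads(a))
                + " serverSpread=" + serverSpread(a)
                + " table1Spread=" + tableSpread(a, "table1")
                + " table2Spread=" + tableSpread(a, "table2")
                + " table3Spread=" + tableSpread(a, "table3"));
        }
    }
}
```

      In both states every table has a spread of 1, which is the best possible for 3 regions on 4 servers, while the server-level spread drops from 3 to 1.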

      Two unit tests will be added as well, the second of which demonstrates the advantage of the new strategy. Also, an onConfigurationChange method will be implemented so that the "slop" variable can be changed at runtime without a restart.
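
      As a rough illustration of what hot control of "slop" could look like, here is a minimal standalone sketch. It is not the code from the patch: the class `SlopConfigSketch` is hypothetical, it takes a plain `Map` instead of an HBase `Configuration` to stay self-contained, and the key `hbase.regions.slop`, the default of 0.2, and the floor/ceiling check around the average load are stated here as assumptions about how the balancer interprets slop.

```java
import java.util.HashMap;
import java.util.Map;

public class SlopConfigSketch {
    // Assumed configuration key and default; verify against your HBase version.
    public static final String SLOP_KEY = "hbase.regions.slop";
    public static final float DEFAULT_SLOP = 0.2f;

    // volatile so a balancer thread sees updates made by the config-reload thread.
    private volatile float slop = DEFAULT_SLOP;

    /** Re-reads slop from the new configuration and clamps it to [0, 1]. */
    public void onConfigurationChange(Map<String, String> conf) {
        String raw = conf.get(SLOP_KEY);
        float value = (raw == null) ? DEFAULT_SLOP : Float.parseFloat(raw);
        slop = Math.max(0f, Math.min(1f, value));
    }

    public float getSlop() {
        return slop;
    }

    /**
     * Returns true if any server load falls outside the tolerated band
     * [floor(avg * (1 - slop)), ceil(avg * (1 + slop))].
     */
    public boolean needsBalance(int[] serverLoads) {
        if (serverLoads.length == 0) {
            return false;
        }
        int total = 0;
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int load : serverLoads) {
            total += load;
            min = Math.min(min, load);
            max = Math.max(max, load);
        }
        double average = (double) total / serverLoads.length;
        int floor = (int) Math.floor(average * (1 - slop));
        int ceiling = (int) Math.ceil(average * (1 + slop));
        return max > ceiling || min < floor;
    }

    public static void main(String[] args) {
        SlopConfigSketch balancer = new SlopConfigSketch();
        int[] loads = {3, 3, 3, 0};
        System.out.println("slop=" + balancer.getSlop()
            + " needsBalance=" + balancer.needsBalance(loads));

        // Simulate a live configuration reload that loosens the tolerance.
        Map<String, String> conf = new HashMap<>();
        conf.put(SLOP_KEY, "1.0");
        balancer.onConfigurationChange(conf);
        System.out.println("slop=" + balancer.getSlop()
            + " needsBalance=" + balancer.needsBalance(loads));
    }
}
```

      Under these assumptions, the 3,3,3,0 cluster from the example is flagged for balancing at the default slop but not once slop is raised to 1.0, which is the kind of runtime tuning the new method is meant to allow.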

      We have been running this strategy on our largest cluster for several months, which gives some assurance that it works in practice.

      Attachments

        1. HBASE-17110-V8.patch
          38 kB
          Charlie Qiangeng Xu
        2. HBASE-17110-V7.patch
          38 kB
          Charlie Qiangeng Xu
        3. HBASE-17110-V6.patch
          36 kB
          Charlie Qiangeng Xu
        4. HBASE-17110-V5.patch
          35 kB
          Charlie Qiangeng Xu
        5. HBASE-17110-V4.patch
          35 kB
          Charlie Qiangeng Xu
        6. HBASE-17110-V3.patch
          37 kB
          Charlie Qiangeng Xu
        7. HBASE-17110-V2.patch
          36 kB
          Charlie Qiangeng Xu
        8. HBASE-17110.patch
          37 kB
          Yu Li

            People

              Assignee: xharlie Charlie Qiangeng Xu
              Reporter: xharlie Charlie Qiangeng Xu
              Votes: 0
              Watchers: 15
