  1. HBase
  2. HBASE-3663

The starvation problem in the current load balance algorithm

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Balancer

    Description

      This is an interesting starvation case. Two conditions together trigger the problem.
      Let r be the number of regions.
      Let s be the number of servers.
      Condition 1: r/s - r/(s+1) << 1, i.e. adding one server barely changes the average load.

      Condition 2: the load of every server is less than or equal to the ceiling of the average load.
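
Condition 1 can be made concrete with a short calculation. The sketch below simplifies the difference and plugs in r = 62 and s = 16, the figures from the unit test in this report; the closing remark about the ceiling is an inference from those numbers, not a statement taken from the balancer code.

```latex
\[
\frac{r}{s} - \frac{r}{s+1} \;=\; \frac{r}{s(s+1)},
\qquad
\frac{62}{16 \cdot 17} \;=\; \frac{62}{272} \;\approx\; 0.228 .
\]
% The average drops from 62/16 = 3.875 to 62/17 \approx 3.647,
% so its ceiling is 4 both before and after the new server joins.
```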

      Here is the unit test that verifies the problem:
      For example, there are 16 servers and 62 regions, so the average load is
      3.875. Setting the slop to 0 keeps the load of each server at either 3 or 4.
      When a new server joins, no existing server hands regions over to it, because the new average is 62/17 ≈ 3.65 and no server's load exceeds its ceiling of 4.
      (Setting the slop to 0 just makes the situation easy to trigger; otherwise it takes much larger numbers.)
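
The starvation in this example can be reproduced with a small stand-alone model. This is only a sketch under a simplifying assumption: a server sheds exactly the regions it holds above the ceiling of the average. The class and method names (`CeilThresholdDemo`, `regionsShedWithCeil`, `exampleLoads`) are hypothetical and are not taken from the HBase source or from the attached patch.

```java
import java.util.Arrays;

// Hypothetical simplified model, not HBase's balancer code.
class CeilThresholdDemo {

    // Counts the regions that existing servers would shed if each server
    // only gives up what it holds above ceil(total / serversAfterJoin).
    static int regionsShedWithCeil(int[] loads, int serversAfterJoin) {
        int total = Arrays.stream(loads).sum();
        int ceilAvg = (int) Math.ceil((double) total / serversAfterJoin);
        int shed = 0;
        for (int load : loads) {
            shed += Math.max(0, load - ceilAvg);
        }
        return shed;
    }

    // The layout from the report: 16 servers, 62 regions,
    // i.e. 14 servers with 4 regions and 2 servers with 3.
    static int[] exampleLoads() {
        int[] loads = new int[16];
        Arrays.fill(loads, 4);
        loads[14] = 3;
        loads[15] = 3;
        return loads;
    }

    public static void main(String[] args) {
        int shed = regionsShedWithCeil(exampleLoads(), 17);
        System.out.println("regions shed to the new server: " + shed);
    }
}
```

Under this model the 17th server receives nothing, because every existing load (3 or 4) is at or below the ceiling of 4.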

      The solution is pretty straightforward: compare against the floor of the average instead of the ceiling. Servers that are slightly more loaded than the others then give up regions, so the load is balanced evenly.
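
One way to read the proposed fix is sketched below. It assumes that servers above the floor of the average give up regions one at a time until the new server reaches that floor. The attached patch is not reproduced in this report, so the class `FloorThresholdDemo` and the method `addServerWithFloor` are hypothetical illustrations and not the patch's actual logic.

```java
import java.util.Arrays;

// Hypothetical sketch of a floor-based threshold, not the attached patch.
class FloorThresholdDemo {

    // Returns the loads after one new (empty) server joins. Servers holding
    // more than floor(avg) regions hand regions to the new server until the
    // new server itself holds floor(avg) regions.
    static int[] addServerWithFloor(int[] loads) {
        int servers = loads.length + 1;
        int total = Arrays.stream(loads).sum();
        int floorAvg = total / servers;               // integer division = floor
        int[] result = Arrays.copyOf(loads, servers); // new server starts at 0
        int newServer = servers - 1;
        for (int i = 0; i < newServer && result[newServer] < floorAvg; i++) {
            while (result[i] > floorAvg && result[newServer] < floorAvg) {
                result[i]--;
                result[newServer]++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 16 servers, 62 regions: 14 servers with 4 regions, 2 with 3.
        int[] loads = new int[16];
        Arrays.fill(loads, 4);
        loads[14] = 3;
        loads[15] = 3;
        System.out.println(Arrays.toString(addServerWithFloor(loads)));
    }
}
```

For the 16-server, 62-region case the floor of the new average is 3, so three of the servers holding 4 regions each give one up and the new server ends with 3 regions; every server then holds either 3 or 4.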

      I have also attached the results for the case above under the old and the new balance algorithms. (I set the slop to 0 when testing.)

      Attachments

        1. result_old_load_balance.txt
          12 kB
          Liyin Tang
        2. result_new_load_balance.txt
          8 kB
          Liyin Tang
        3. HBASE_3665[0.89].patch
          8 kB
          Liyin Tang

          People

            Assignee: Unassigned
            Reporter: Liyin Tang
            Votes: 0
            Watchers: 2