HBase / HBASE-3945

Load balancer shouldn't move the same region in two consecutive balancing actions

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Keeping a region on the same region server would give good stability for active scanners.
      We shouldn't reassign the same region in two successive calls to balanceCluster().

        Activity

        Ted Yu added a comment -

        The motivation for this JIRA was to reduce disruption to long-running compactions.
        Since the region server alone decides when to compact, it is not easy for the load balancer to know the exact timing and duration of compactions.
        Shall we introduce a new parameter, e.g. hbase.balancer.inert.duration, specifying how long a region is kept on the same region server?
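
For illustration only, the proposed parameter could be read as a per-region cooldown. The sketch below is not HBase code: the class InertDurationFilter, its method, and the use of plain region-name strings in place of the balancer's real plan objects are all assumptions made for the example, and hbase.balancer.inert.duration is only the name suggested in this comment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of HBase): drops planned moves for regions that moved too recently. */
final class InertDurationFilter {
    // Cooldown in milliseconds; assumed to come from the proposed hbase.balancer.inert.duration.
    private final long inertMillis;
    // Region name -> time (ms) at which this filter last allowed the region to move.
    private final Map<String, Long> lastMoved = new HashMap<>();

    InertDurationFilter(long inertMillis) {
        this.inertMillis = inertMillis;
    }

    /**
     * Returns the candidate regions that may be moved at nowMillis and records
     * them as moved. Regions still inside their cooldown are left out.
     */
    List<String> filter(List<String> candidates, long nowMillis) {
        List<String> allowed = new ArrayList<>();
        for (String region : candidates) {
            Long last = lastMoved.get(region);
            if (last == null || nowMillis - last >= inertMillis) {
                allowed.add(region);
                lastMoved.put(region, nowMillis);
            }
        }
        return allowed;
    }
}
```

A balancer would pass its candidate moves through such a filter before acting on them. The cost is the one raised later in this thread: one more knob whose value has to be guessed relative to compaction times.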

        Jonathan Gray added a comment -

        I worry about this approach of more and more knobs, especially when they don't directly address what a good/bad load balance really is.

        If a region gets moved in two consecutive balancing actions, then something is wrong with the balancer in the first place. While I agree in principle that regions moving multiple times and quickly is not desirable, this will be a common outcome if the balancing algorithm isn't already taking into account metrics over time (rather than short snapshots). If we're using load but then adding all these limits/controls, it's hard to ever understand the behavior of the balancer.

        Ted Yu added a comment -

        @Jonathan:
        I agree with your comment.
        I think we should use as few knobs as possible.

        Stack suggested this approach for the problem reported by Schubert and Anty. See his comment in HBASE-3943.

        Ted Yu added a comment -

        Without a new parameter, we can keep any region on its region server for at least two balancing cycles.
        This ties hbase.balancer.period, at least loosely, to the expected duration of compaction(s).

        Or maybe I misinterpreted Stack's comment.
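
As a rough sketch of the parameter-free variant, the balancer could remember which regions it moved in the previous cycle and skip them in the next one. Nothing here comes from the HBase code base: ConsecutiveMoveGuard and nextCycle are hypothetical names, and region names as strings stand in for whatever plan objects balanceCluster() actually returns.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical guard (not part of HBase): a region moved in one cycle is skipped in the next. */
final class ConsecutiveMoveGuard {
    // Regions this guard allowed to move during the previous balancing cycle.
    private Set<String> movedLastCycle = new HashSet<>();

    /**
     * Call once per balancing cycle with the regions the balancer wants to move.
     * Returns only those that were not moved in the immediately preceding cycle.
     */
    List<String> nextCycle(List<String> candidates) {
        List<String> allowed = new ArrayList<>();
        Set<String> movedNow = new HashSet<>();
        for (String region : candidates) {
            // movedNow.add() also removes duplicates within a single cycle.
            if (!movedLastCycle.contains(region) && movedNow.add(region)) {
                allowed.add(region);
            }
        }
        movedLastCycle = movedNow;
        return allowed;
    }
}
```

With this shape, the minimum time a region stays put is one full balancing period, which is how hbase.balancer.period ends up tied to compaction duration. A check along these lines could also serve as the basis for a unit test asserting that no region appears in two successive results.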

        Lars Hofhansl added a comment -

        @Ted: Do you want to keep this open?

        Ted Yu added a comment -

        Let's keep this open for a while.
        At the very least, I'd expect a unit test that shows such consecutive moves don't happen.


          People

          • Assignee: Unassigned
          • Reporter: Ted Yu
          • Votes: 0
          • Watchers: 3
