[HBASE-6389] Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 0.94.0, 0.95.2
Fix Version/s: 0.94.3, 0.95.0
Component/s: master
Labels:
None

Hadoop Flags:

Incompatible change, Reviewed
Release Note:

Reverts the cluster startup behavior to pre 0.94.0.

With this change, the Master waits until at least "hbase.master.wait.on.regionservers.mintostart" Region Servers have registered with it before it starts region assignment. The default value of this setting is 1.

In large clusters with thousands of regions you may want to increase this to a number of Region Servers sufficient to open those regions in parallel.

If left at the default, the Master could at times assign all regions to a single Region Server, which results in slow startup and in the worst case could OOM the Region Server (sometimes resulting in META inconsistency).

Here is how it works now (from the javadoc):

We wait until one of these conditions is met:
- the master is stopped
- the 'hbase.master.wait.on.regionservers.maxtostart' number of region servers is reached
- the 'hbase.master.wait.on.regionservers.mintostart' is reached AND no new region server has checked in for 'hbase.master.wait.on.regionservers.interval' time AND the 'hbase.master.wait.on.regionservers.timeout' is reached
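
The three conditions above can be read as a single "done waiting" check. The following is a minimal sketch of that reading, not the HBase implementation: the class `StartupWaitRule` and its method are hypothetical, the settings come from a plain `Map` rather than an HBase configuration object, and the interval and timeout fallback values (1500 ms and 4500 ms) are illustrative assumptions.

```java
import java.util.Map;

// Hypothetical helper for illustration; this class does not exist in HBase.
final class StartupWaitRule {
    static final String PREFIX = "hbase.master.wait.on.regionservers.";

    final int minToStart;
    final int maxToStart;
    final long intervalMs;
    final long timeoutMs;

    StartupWaitRule(Map<String, String> conf) {
        // mintostart defaults to 1 per the release note. The other fallback
        // values are assumptions made for this sketch.
        int min = Integer.parseInt(conf.getOrDefault(PREFIX + "mintostart", "1"));
        int max = Integer.parseInt(
                conf.getOrDefault(PREFIX + "maxtostart", String.valueOf(Integer.MAX_VALUE)));
        this.minToStart = min;
        this.maxToStart = Math.max(min, max); // keep the maximum at or above the minimum
        this.intervalMs = Long.parseLong(conf.getOrDefault(PREFIX + "interval", "1500"));
        this.timeoutMs = Long.parseLong(conf.getOrDefault(PREFIX + "timeout", "4500"));
    }

    /** Returns true when the Master may stop waiting and begin region assignment. */
    boolean doneWaiting(boolean masterStopped, int registered,
                        long msSinceLastCheckIn, long waitedMs) {
        if (masterStopped) {
            return true;                       // condition 1: the master is stopped
        }
        if (registered >= maxToStart) {
            return true;                       // condition 2: maxtostart is reached
        }
        // condition 3: minimum reached AND quiet for 'interval' AND timeout reached
        return registered >= minToStart
                && msSinceLastCheckIn >= intervalMs
                && waitedMs >= timeoutMs;
    }
}
```

With "mintostart" set to 3, a call such as `doneWaiting(false, 2, 10000, 10000)` returns false in this sketch: two registered Region Servers never satisfy the third condition, however long the Master has waited.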


Description

It seems I was mistaken in my assumption that changing the value of "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from default of 1) can help prevent assignment of all regions to one (or a small number of) region server(s).

While this was the case in 0.90.x and 0.92.x, the behavior changed from 0.94.0 onwards to address ~~HBASE-4993~~.

From 0.94.0 onwards, the Master proceeds immediately after the timeout has lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not been reached.

Reading the current conditions of waitForRegionServers() makes this clear:

ServerManager.java (trunk rev:1360470)

....
  /**
   * Wait for the region servers to report in.
   * We will wait until one of this condition is met:
   *  - the master is stopped
   *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
   *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
   *    region servers is reached
   *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
   *   there have been no new region server in for
   *      'hbase.master.wait.on.regionservers.interval' time
   *
   * @throws InterruptedException
   */
  public void waitForRegionServers(MonitoredTask status)
  throws InterruptedException {
....
....
    while (
      !this.master.isStopped() &&
        slept < timeout &&
        count < maxToStart &&
        (lastCountChange+interval > now || count < minToStart)
      ){
....

So with the current conditions, the wait ends as soon as the timeout is reached, even if fewer Region Servers than "hbase.master.wait.on.regionservers.mintostart" have checked in with the Master, and the Master proceeds with region assignment among those Region Servers alone.
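
To make the early exit concrete, here is a small simulation of the loop condition quoted above, run against a simulated clock. It is a sketch under stated assumptions: the class `OldWaitSimulation`, its parameters, and the fixed-step clock are hypothetical and stand in for the real sleep loop and server count in ServerManager.

```java
// Hypothetical simulation; only the while-condition mirrors the quoted code.
final class OldWaitSimulation {

    /**
     * Replays the pre-fix wait loop and returns how many region servers
     * had checked in when the loop exited.
     *
     * @param checkInTimesMs simulated time (ms since start) at which each server checks in
     * @param stepMs         simulated sleep per loop iteration, must be positive
     */
    static int countAtExit(long[] checkInTimesMs, int minToStart, int maxToStart,
                           long interval, long timeout, long stepMs) {
        if (stepMs <= 0) {
            throw new IllegalArgumentException("stepMs must be positive");
        }
        long now = 0;
        long slept = 0;
        long lastCountChange = 0;
        int count = checkedIn(checkInTimesMs, now);

        // Same shape as the 0.94.0 condition: the timeout alone ends the wait.
        while (slept < timeout
                && count < maxToStart
                && (lastCountChange + interval > now || count < minToStart)) {
            now += stepMs;
            slept += stepMs;
            int newCount = checkedIn(checkInTimesMs, now);
            if (newCount != count) {
                count = newCount;
                lastCountChange = now;
            }
        }
        return count;
    }

    /** Counts the servers whose check-in time is at or before 'now'. */
    private static int checkedIn(long[] checkInTimesMs, long now) {
        int n = 0;
        for (long t : checkInTimesMs) {
            if (t <= now) {
                n++;
            }
        }
        return n;
    }
}
```

With check-in times of 0, 6000 and 6000 ms, a minimum of 3, an interval of 1500 ms and a timeout of 4500 ms (illustrative values), `countAtExit` returns 1: the loop gives up at the timeout and assignment would start with a single Region Server.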

As mentioned in ~~HBASE-4993~~, and I concur, this could have a disastrous effect in a large cluster, especially now that MSLAB is turned on.

To enforce the required quorum specified by "hbase.master.wait.on.regionservers.mintostart" irrespective of the timeout, these conditions need to be modified as follows:

ServerManager.java

..
  /**
   * Wait for the region servers to report in.
   * We will wait until one of this condition is met:
   *  - the master is stopped
   *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
   *    region servers is reached
   *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
   *   there have been no new region server in for
   *      'hbase.master.wait.on.regionservers.interval' time AND
   *   the 'hbase.master.wait.on.regionservers.timeout' is reached
   *
   * @throws InterruptedException
   */
  public void waitForRegionServers(MonitoredTask status)
..
..
    int minToStart = this.master.getConfiguration().
    getInt("hbase.master.wait.on.regionservers.mintostart", 1);
    int maxToStart = this.master.getConfiguration().
    getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
    if (maxToStart < minToStart) {
      maxToStart = minToStart;
    }
..
..
    while (
      !this.master.isStopped() &&
        count < maxToStart &&
        (lastCountChange+interval > now || timeout > slept || count < minToStart)
      ){
..
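
The difference between the two loop conditions is easiest to see when both are evaluated on the same snapshot of state. The sketch below does that with plain values. The class `WaitConditions` and its method names are hypothetical; only the boolean expressions and the maxToStart adjustment are taken from the code quoted in this issue.

```java
// Hypothetical standalone class for comparing the two conditions.
final class WaitConditions {

    /** Loop condition as of 0.94.0: true means the Master keeps waiting. */
    static boolean keepWaitingBefore(boolean stopped, int count, int minToStart, int maxToStart,
                                     long lastCountChange, long interval, long now,
                                     long slept, long timeout) {
        return !stopped
                && slept < timeout
                && count < maxToStart
                && (lastCountChange + interval > now || count < minToStart);
    }

    /** Proposed loop condition: the timeout no longer ends the wait on its own. */
    static boolean keepWaitingAfter(boolean stopped, int count, int minToStart, int maxToStart,
                                    long lastCountChange, long interval, long now,
                                    long slept, long timeout) {
        return !stopped
                && count < maxToStart
                && (lastCountChange + interval > now || timeout > slept || count < minToStart);
    }

    /** Mirrors the proposed guard: maxToStart is raised to minToStart if it is lower. */
    static int clampMax(int minToStart, int maxToStart) {
        return maxToStart < minToStart ? minToStart : maxToStart;
    }
}
```

Take one of three required Region Servers checked in, 5000 ms elapsed, an interval of 1500 ms and a timeout of 4500 ms (illustrative values). `keepWaitingBefore(false, 1, 3, Integer.MAX_VALUE, 0, 1500, 5000, 5000, 4500)` is false, so the old loop exits, while `keepWaitingAfter` with the same arguments is true. A second consequence follows from the same expressions: with three servers checked in and quiet after 2000 ms, the old condition stops waiting, whereas the proposed one keeps waiting until the timeout has also passed.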

Attachments


- HBASE-6389_0.94.patch (31/Oct/12 09:34, 9 kB, Aditya Kishore)
- HBASE-6389_trunk_v2.patch (18/Oct/12 21:48, 11 kB, Aditya Kishore)
- HBASE-6389_trunk_v2.patch (18/Oct/12 13:58, 11 kB, Aditya Kishore)
- testReplication.jstack (20/Jul/12 01:37, 204 kB, Ted Yu)
- org.apache.hadoop.hbase.TestZooKeeper-output.txt (19/Jul/12 22:50, 120 kB, Ted Yu)
- HBASE-6389_trunk.patch (19/Jul/12 03:06, 5 kB, Aditya Kishore)
- HBASE-6389_trunk.patch (13/Jul/12 20:53, 5 kB, Aditya Kishore)
- HBASE-6389_trunk.patch (13/Jul/12 02:52, 3 kB, Aditya Kishore)
