Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Fix Version/s: 0.92.2, 0.94.1, 0.95.2
Hadoop Flags: Reviewed
Release Note: Rather than exit, the regionserver will now wait even though the root directory in zookeeper has yet to be created.
Description
When launching a brand-new cluster, the master has to be started first. Starting the master and the region servers (RS) at the same time can therefore hit a race condition.
Master startup code is something like this:
- establish zk connection
- create root znodes in zk (/hbase)
- create ephemeral node for master (/hbase/master)
Region server startup code is something like this:
- establish zk connection
- check whether the root znode (/hbase) is there. If not, shut down.
- wait for the master to create the znode /hbase/master
So the problem is that on the very first launch of the cluster, the RS aborts at startup if the /hbase znode has not been created yet (only the master creates it, if needed). Since /hbase is not deleted on cluster shutdown, the order in which the servers are started does not matter on subsequent cluster starts. So this affects only first launches.
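The behavior described in the release note (wait for the root znode instead of exiting) can be sketched roughly as below. This is only an illustration, not the actual patch: the class `ZnodeWait`, the method `waitUntil`, and its parameters are hypothetical names, and the existence check is passed in as a `BooleanSupplier` so that the sketch stays independent of the ZooKeeper client. In a real region server the supplier would presumably wrap an "exists" check on /hbase.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HBase: polls a condition until it holds.
public final class ZnodeWait {
    private ZnodeWait() {}

    /**
     * Repeatedly evaluates `exists` until it returns true or `timeoutMs`
     * has elapsed. Returns true if the condition was met, false on timeout
     * or if the waiting thread was interrupted.
     */
    public static boolean waitUntil(BooleanSupplier exists, long pollMs, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (exists.getAsBoolean()) {
                return true;               // e.g. the master has created /hbase
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;              // gave up waiting
            }
            try {
                Thread.sleep(pollMs);      // back off before checking again
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }
}
```

With a helper like this, the second region server step would become "wait until the root znode (/hbase) is there" rather than "if not, shut down", so the start order of master and RS no longer matters on first launch.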