Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
None
-
None
-
None
-
Reviewed
Description
The cluster id will set once we become the active master in finishActiveMasterInitialization, but the blockUntilBecomingActiveMaster and finishActiveMasterInitialization are both called in a thread to make the constructor of HMaster return without blocking. And since HMaster itself is also a HRegionServer, it will create a Connection and then start calling reportForDuty. And when creating the ConnectionImplementation, we will read the cluster id from zk, but the cluster id may have not been set yet since it is set in another thread, we will get an exception and use the default cluster id instead.
I always get this when running UTs which will start a mini cluster
2018-01-03 15:16:37,916 WARN [M:0;zhangduo-ubuntu:32848] client.ConnectionImplementation(528): Retrieve cluster id failed java.util.concurrent.ExecutionException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/hbaseid at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:526) at org.apache.hadoop.hbase.client.ConnectionImplementation.<init>(ConnectionImplementation.java:286) at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.<init>(ConnectionUtils.java:141) at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.<init>(ConnectionUtils.java:137) at org.apache.hadoop.hbase.client.ConnectionUtils.createShortCircuitConnection(ConnectionUtils.java:185) at org.apache.hadoop.hbase.regionserver.HRegionServer.createClusterConnection(HRegionServer.java:781) at org.apache.hadoop.hbase.regionserver.HRegionServer.setupClusterConnection(HRegionServer.java:812) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:827) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:938) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:550) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/hbaseid at org.apache.zookeeper.KeeperException.create(KeeperException.java:111) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$ZKTask$1.exec(ReadOnlyZKClient.java:163) at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:311) ... 1 more
Attachments
Attachments
Issue Links
- is related to
-
HBASE-19831 Regions on the Master Redux
- Resolved
-
HBASE-21535 Zombie Master detector is not working
- Resolved
- relates to
-
HBASE-16367 Race between master and region server initialization may lead to premature server abort
- Resolved
- links to