Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
When ZooKeeper service discovery mode is used, an HS2 instance can fail to notice that it has been deregistered from ZooKeeper. Once the condition below is met, this happens every time.
Reproduction is simple.
1. Find one of the ZK servers that holds the DeRegisterWatcher watches of the HS2 instances. If the ZK server version is 3.5.0 or above, this is easy to find via http://zk-server:8080/commands/watches (a ZK AdminServer feature).
2. Check which HS2 instance has a watch on the ZK server found in step 1; say it is hs2-of-2.
3. Restart the ZK server found in step 1.
4. Deregister hs2-of-2 with the command
hive --service hiveserver2 -deregister hs2-of-2
5. hs2-of-2 never learns that it must shut down, because the DeRegisterWatcher watch had already fired at step 3.
The cause of the problem is explained at https://zookeeper.apache.org/doc/r3.3.3/zookeeperProgrammers.html#sc_WatchRememberThese
I added some logging to DeRegisterWatcher and checked which events occurred at step 3 (the restart of the ZK server):
- WatchedEvent state:Disconnected type:None path:null
- WatchedEvent[WatchedEvent state:SyncConnected type:None path:null]
- WatchedEvent[WatchedEvent state:SaslAuthenticated type:None path:null]
- WatchedEvent[WatchedEvent state:SyncConnected type:NodeDataChanged path:/hiveserver2/serverUri=hs2-of-2:10000;version=3.1.2;sequence=0000000711]
As the ZK manual says, watches are one-time triggers. When the connection to the ZK server was re-established, a state:SyncConnected type:NodeDataChanged event fired for the path, and that consumed the watch. DeRegisterWatcher must be registered again on the same znode to receive a later NodeDeleted event.
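The one-time-trigger behaviour and the effect of re-registering can be illustrated with a small self-contained model. This is only a sketch: it does not use the ZooKeeper or Curator client, and ToyZk, Watcher, run and the shortened znode path are hypothetical names invented for the illustration, not Hive code. In a real fix the re-registration would have to go through the ZooKeeper client that HS2 already uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OneShotWatchDemo {
    enum EventType { NodeDataChanged, NodeDeleted }

    /** Hypothetical callback, loosely modelled on a ZooKeeper watcher. */
    interface Watcher {
        void process(EventType type, String path);
    }

    /** Toy stand-in for a ZK server: every watch fires at most once. */
    static class ToyZk {
        private final Map<String, List<Watcher>> watches = new HashMap<>();

        void watch(String path, Watcher w) {
            watches.computeIfAbsent(path, p -> new ArrayList<>()).add(w);
        }

        void fire(EventType type, String path) {
            // Remove the watches before delivering: one-time trigger semantics.
            List<Watcher> fired = watches.remove(path);
            if (fired == null) {
                return; // nobody is watching any more, the event is lost
            }
            for (Watcher w : fired) {
                w.process(type, path);
            }
        }
    }

    /** Returns true if the simulated HS2 instance saw NodeDeleted. */
    public static boolean run(boolean reRegister) {
        ToyZk zk = new ToyZk();
        String path = "/hiveserver2/hs2-of-2"; // shortened, hypothetical znode path
        boolean[] sawDelete = {false};
        Watcher[] self = new Watcher[1];
        self[0] = (type, p) -> {
            if (type == EventType.NodeDeleted) {
                sawDelete[0] = true;   // HS2 would shut down here
            } else if (reRegister) {
                zk.watch(p, self[0]);  // set the watch again for future events
            }
        };
        zk.watch(path, self[0]);
        zk.fire(EventType.NodeDataChanged, path); // event seen after the ZK restart
        zk.fire(EventType.NodeDeleted, path);     // the later deregistration
        return sawDelete[0];
    }

    public static void main(String[] args) {
        System.out.println("without re-registration: " + run(false)); // false
        System.out.println("with re-registration:    " + run(true));  // true
    }
}
```

Without re-registration the NodeDataChanged event uses up the only watch, so the later NodeDeleted event reaches nobody, which matches the behaviour seen in the reproduction.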