Resolution: Not A Bug
Affects Version/s: 3.4.6
Fix Version/s: None
In my projects, I use three ZooKeeper server as an ensemble:
zk1 as a follower on 192.168.25.221,
zk2 as a follower on 192.168.25.222,
zk3 as the leader on 192.168.25.223.
My two programs using ZooKeepers C client run on 192.168.25.221 and 192.168.25.222.
When watched the ZOO_CONNECTED_STATE, my program will use the zookeeper to obtain a lock do the following:
1. Create a ZOO_EPHEMERAL | ZOO_SEQUENCE node under '/Lock/'.
2. Call getChildren( ) on the '/Lock/' node.
3. If the pathname created in step 1 has the lowest sequence number suffix, the program has the lock and do something,then release the lock simply delete the node created in step 1.
4. The program calls exists() with the watch flag set on the lowest sequence number node.
5. if exists( ) returns false, go to step 2. Otherwise, wait for a notification(ZOO_DELETED_EVENT) for the pathname from the previous step before going to step 2.
When I stop a follower such as zk1/zk2, everything is ok, my programs on 192.168.25.221 and 192.168.25.222 do its work orderly under the lock's control.
When I stop the leader such as zk3(I have restarted zk1/zk2), my program on 192.168.25.221 got the lock and release it normally, and my program on 192.168.25.222 detected existence of the node
created by the program on 192.168.25.221, but keep waiting and can't receive the ZOO_DELETED_EVENT notification.
Does anyone else see the same problem？
1. The attachment is the log of the zookeeper on 192.168.25.221 and 192.168.25.222 when I stop the leader on 192.168.25.223
2. Actually I have other more programs using ZooKeepers C client run on 192.168.25.221, 192.168.25.222 and 192.168.25.223.
3. The system time on 192.168.25.221 is slower 1 minute and 33 seconds than 192.168.25.222 and 192.168.25.223. so when I stop the leader, it's 2016-05-28 22:33:34 on 192.168.25.221 and 2016-05-28 22:35:07 on 192.168.25.222.