Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Sprint: Sprint #4 10/2 - 10/16
Description
FINALIZE callbacks are sent asynchronously via CallbackHandler#reset(), while Zk callbacks are queued in ZkEventThread. A FINALIZE callback can therefore be handled before all pending Zk callbacks have been cleaned up, which creates race conditions. For example, on zk session expiry, a GenericController that gets a FINALIZE callback cleans up all listeners using ZkClient#unsubscribe(), but Zk callbacks left over in ZkEventThread arrive later and re-subscribe all listeners, leaking zk watchers.
This was observed by setting up two controllers and expiring the leader (by simulating a long GC pause). The second controller takes leadership and adds all listeners, but when the former leader recovers from the GC pause, it receives leftover Zk callbacks and re-subscribes the live-instance listener. It then reacts to all live-instance changes even though it has not acquired leadership.
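The ordering problem described above can be reproduced in a small, deterministic model. The sketch below is only an illustration and does not use Helix or ZkClient code: FakeZkClient, FakeEventQueue, and simulate are hypothetical stand-ins for the listener registry, ZkEventThread, and the controller's handling of FINALIZE. It contrasts the racy ordering (FINALIZE handled while a Zk callback is still queued) with the ordering in which the queue is drained first. It is not meant to represent the actual fix.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashSet;
import java.util.Queue;
import java.util.Set;

class FinalizeRaceSketch {
    // Hypothetical stand-in for the listeners registered through ZkClient.
    static final class FakeZkClient {
        final Set<String> subscribed = new LinkedHashSet<>();
        void subscribe(String listener) { subscribed.add(listener); }
        void unsubscribeAll() { subscribed.clear(); }
    }

    // Hypothetical stand-in for ZkEventThread: callbacks run later, in FIFO order.
    static final class FakeEventQueue {
        private final Queue<Runnable> pending = new ArrayDeque<>();
        void enqueue(Runnable callback) { pending.add(callback); }
        void drain() {
            Runnable callback;
            while ((callback = pending.poll()) != null) {
                callback.run();
            }
        }
    }

    // Returns the listeners still subscribed after FINALIZE and all queued callbacks have run.
    static Set<String> simulate(boolean drainBeforeFinalize) {
        FakeZkClient zk = new FakeZkClient();
        FakeEventQueue events = new FakeEventQueue();

        zk.subscribe("liveInstanceListener");
        // A Zk callback is queued; handling it re-subscribes the listener.
        events.enqueue(() -> zk.subscribe("liveInstanceListener"));

        if (drainBeforeFinalize) {
            events.drain(); // all Zk callbacks handled before FINALIZE
        }
        zk.unsubscribeAll(); // FINALIZE: clean up all listeners
        events.drain();      // any leftover Zk callback runs after the cleanup

        return zk.subscribed;
    }

    public static void main(String[] args) {
        System.out.println("FINALIZE first: " + simulate(false)); // [liveInstanceListener] (leaked)
        System.out.println("drained first:  " + simulate(true));  // []
    }
}
```

In the first run the listener survives the cleanup, which corresponds to the former leader still reacting to live-instance changes after it has lost leadership.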