Description
When a sequence of LeaderAndISR or StopReplica requests is sent from different controllers, causing the group coordinator to be elected / to resign, the resulting events may be re-ordered due to a race condition. For example:
1) A first LeaderAndISR request is received from the old controller, telling the broker to resign as the group coordinator.
2) A second LeaderAndISR request is received from the new controller, telling the broker to become the group coordinator.
3) Although the threads handling requests 1) and 2) are synchronized on the replica manager, their onLeadershipChange callback triggers onElection / onResignation, which schedule the loading / unloading on background threads, and those background tasks are not synchronized with each other.
4) As a result, the load scheduled by onElection may be executed first, and the unload scheduled by onResignation afterwards. The broker would then not recognize itself as the coordinator and hence would respond to any coordinator request with NOT_COORDINATOR.
Here are two proposals off the top of my head:
1) Let the scheduled load / unload functions keep the passed-in leader epoch, and also materialize the epoch in memory. Then, when executing the unload, check its epoch against the in-memory leader epoch and skip it if it is stale.
2) This may be a bit simpler: use a single background thread working on a FIFO queue of loading / unloading jobs. Since the callers are already synchronized on the replica manager and their order is preserved, the enqueued loading / unloading jobs would be correctly ordered as well, and we would avoid the re-ordering.
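To make the two proposals concrete, here is a rough, self-contained Java sketch. It is not the coordinator code: the class names (CoordinatorOrderingSketch, EpochFencedState, FifoState), the method names, the epoch values 5 and 6, and the use of a plain ExecutorService in place of the broker's scheduler are all assumptions for illustration. The rule "ignore a job whose epoch is not newer than the last applied one" is one possible reading of proposal 1.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class CoordinatorOrderingSketch {

    /** Proposal 1 (hypothetical names): fence stale load / unload jobs by leader epoch. */
    static final class EpochFencedState {
        private final Map<Integer, Integer> lastEpoch = new HashMap<>();
        private final Set<Integer> ownedPartitions = new HashSet<>();

        /** Applies a load (elect = true) or unload job; returns false if it was skipped as stale. */
        synchronized boolean apply(int partition, int leaderEpoch, boolean elect) {
            Integer current = lastEpoch.get(partition);
            if (current != null && leaderEpoch <= current) {
                return false; // an event with a newer (or equal) epoch was already applied
            }
            lastEpoch.put(partition, leaderEpoch);
            if (elect) {
                ownedPartitions.add(partition);
            } else {
                ownedPartitions.remove(partition);
            }
            return true;
        }

        synchronized boolean isCoordinator(int partition) {
            return ownedPartitions.contains(partition);
        }
    }

    /** Proposal 2 (hypothetical names): one worker thread draining a FIFO queue of jobs. */
    static final class FifoState {
        // A single-thread executor runs submitted tasks sequentially, in submission order.
        private final ExecutorService worker = Executors.newSingleThreadExecutor();
        private final Set<Integer> ownedPartitions = ConcurrentHashMap.newKeySet();

        void scheduleLoad(int partition) {
            worker.execute(() -> ownedPartitions.add(partition));
        }

        void scheduleUnload(int partition) {
            worker.execute(() -> ownedPartitions.remove(partition));
        }

        /** Waits for all queued jobs to finish, then reports ownership of the partition. */
        boolean drainAndCheck(int partition) {
            worker.shutdown();
            try {
                if (!worker.awaitTermination(5, TimeUnit.SECONDS)) {
                    throw new IllegalStateException("jobs did not finish in time");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return ownedPartitions.contains(partition);
        }
    }

    public static void main(String[] args) {
        // Proposal 1: the jobs run in the wrong order (elect at epoch 6, then resign at epoch 5).
        EpochFencedState fenced = new EpochFencedState();
        fenced.apply(0, 6, true);
        boolean staleApplied = fenced.apply(0, 5, false);
        System.out.println("stale unload applied: " + staleApplied);
        System.out.println("fenced coordinator: " + fenced.isCoordinator(0));

        // Proposal 2: callers enqueue in request order (resign, then elect); FIFO keeps it.
        FifoState fifo = new FifoState();
        fifo.scheduleUnload(0);
        fifo.scheduleLoad(0);
        System.out.println("fifo coordinator: " + fifo.drainAndCheck(0));
    }
}
```

In both variants the broker ends up owning partition 0, which is the state the newer controller asked for. The sketch only tracks ownership; real loading / unloading of group metadata is left out.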