Details
Type: Sub-task
Status: Open
Priority: Major
Resolution: Unresolved
Description
When a member gets fenced, it triggers the onPartitionsLost callback, if one is registered, and then rejoins the group. If the member attempts to leave the group (e.g. via unsubscribe) while the callback is still running, the leave operation detects that the member has already been removed from the group (fenced). It then simply aligns the client state with the current broker state and marks the client as UNSUBSCRIBED (the client-side state for "not in group").
This means that, if the user then calls subscribe, the member could attempt to rejoin the group, get an assignment, and trigger onPartitionsAssigned while onPartitionsLost may not have completed yet.
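The sequence above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: the class FencedMemberSketch, its State enum, and its methods are hypothetical stand-ins and are not the actual Kafka client classes. The onPartitionsLost callback is modelled as a future that the application completes when the callback returns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical, simplified model of the client-side member state.
// None of these names are taken from the real Kafka consumer internals.
class FencedMemberSketch {
    enum State { STABLE, FENCED, UNSUBSCRIBED }

    State state = State.STABLE;
    // Completed by the "application" once its onPartitionsLost callback returns.
    CompletableFuture<Void> lostCallback = CompletableFuture.completedFuture(null);
    final List<String> events = new ArrayList<>();

    // The broker fenced the member: start the lost callback.
    void onFenced() {
        state = State.FENCED;
        lostCallback = new CompletableFuture<>();
        events.add("onPartitionsLost started");
        lostCallback.thenRun(() -> events.add("onPartitionsLost completed"));
    }

    // Leaving while fenced only aligns local state; the callback is not awaited.
    void unsubscribe() {
        if (state == State.FENCED) {
            state = State.UNSUBSCRIBED;
        }
    }

    // A new subscribe is treated as a brand-new member and is not blocked.
    void subscribeAndReceiveAssignment() {
        if (state == State.UNSUBSCRIBED) {
            state = State.STABLE;
            events.add("onPartitionsAssigned invoked");
        }
    }

    static List<String> runScenario() {
        FencedMemberSketch member = new FencedMemberSketch();
        member.onFenced();                      // lost callback starts
        member.unsubscribe();                   // user leaves mid-callback
        member.subscribeAndReceiveAssignment(); // user rejoins immediately
        member.lostCallback.complete(null);     // lost callback finishes only now
        return member.events;
    }

    public static void main(String[] args) {
        runScenario().forEach(System.out::println);
    }
}
```

In this model the "onPartitionsAssigned invoked" event is recorded between the start and the completion of onPartitionsLost, which is the overlap described above.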
This approach keeps the client state machine simple, given that it does not need to block the new member (it is effectively a new member because the old one got fenced). The new member can rejoin, get an assignment and make progress. The downside is that it could allow overlapping callback executions (lost and assigned) in the above edge case, which is not the behaviour of the old coordinator. Review and validate. The alternative would require more complex logic on the client to ensure that a new member is not allowed to rejoin until the fenced one completes its callback.
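As a rough illustration of the alternative, the sketch below defers the rejoin until the lost callback has completed. It is an assumption-laden model, not a proposal for the actual implementation: GatedRejoinSketch and its methods are hypothetical names, and the real client would also have to handle timeouts, callback failures, and consumer close, which are omitted here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical model of the alternative: a new join is chained behind the
// fenced member's onPartitionsLost callback. Not the real Kafka client code.
class GatedRejoinSketch {
    // Completes after the most recent onPartitionsLost callback has finished.
    private CompletableFuture<Void> lostDone = CompletableFuture.completedFuture(null);
    final List<String> events = new ArrayList<>();

    // Returns the future the "application" completes when its callback returns.
    CompletableFuture<Void> onFenced() {
        CompletableFuture<Void> callback = new CompletableFuture<>();
        events.add("onPartitionsLost started");
        lostDone = callback.thenRun(() -> events.add("onPartitionsLost completed"));
        return callback;
    }

    // The rejoin (and its assignment callback) runs only after lostDone completes.
    CompletableFuture<Void> subscribeAndReceiveAssignment() {
        return lostDone.thenRun(() -> events.add("onPartitionsAssigned invoked"));
    }

    static List<String> runScenario() {
        GatedRejoinSketch member = new GatedRejoinSketch();
        CompletableFuture<Void> callback = member.onFenced();
        CompletableFuture<Void> rejoin = member.subscribeAndReceiveAssignment();
        member.events.add("rejoin pending: " + !rejoin.isDone());
        callback.complete(null); // lost callback returns, rejoin proceeds
        return member.events;
    }

    public static void main(String[] args) {
        runScenario().forEach(System.out::println);
    }
}
```

Here the assignment callback can no longer overlap with onPartitionsLost, at the cost of extra bookkeeping and of delaying the new member for as long as the user callback runs.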