Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Currently, we do not cancel pending AlterIsr requests after the state has been updated through a LeaderAndIsr request received from the controller. This leads to log messages such as the following:
[2021-08-23 18:12:47,317] WARN [Partition __transaction_state-32 broker=3] Failed to enqueue ISR change state LeaderAndIsr(leader=3, leaderEpoch=3, isUncleanLeader=false, isr=List(3, 1), zkVersion=3) for partition __transaction_state-32 (kafka.cluster.Partition)
I think the only complication here is protecting against the AlterIsr callback, which is executed asynchronously. To address this, we can move the `zkVersion` field into `IsrState`. When the callback is invoked, we can compare the existing state against the response state to decide whether to apply the change.
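To make the proposal concrete, here is a minimal sketch of the check the callback could perform. It is not Kafka's actual code: the class `IsrCallbackSketch`, its methods, and the shape of the nested `IsrState` are hypothetical stand-ins, and the rule "apply only if the state the request was sent from is still current and the response carries a newer version" is one possible reading of "compare the existing state against the response state".

```java
import java.util.List;

// Hypothetical, simplified model of a partition's ISR bookkeeping.
class IsrCallbackSketch {

    // Assumed shape of IsrState once zkVersion lives inside it.
    static final class IsrState {
        final List<Integer> isr;
        final int zkVersion;
        final boolean pending; // true while an AlterIsr request is in flight

        IsrState(List<Integer> isr, int zkVersion, boolean pending) {
            this.isr = List.copyOf(isr);
            this.zkVersion = zkVersion;
            this.pending = pending;
        }
    }

    private IsrState state;

    IsrCallbackSketch(List<Integer> isr, int zkVersion) {
        this.state = new IsrState(isr, zkVersion, false);
    }

    synchronized IsrState current() {
        return state;
    }

    // Marks an AlterIsr request as in flight and returns the state it was
    // sent from, so the asynchronous callback can refer back to it later.
    synchronized IsrState sendAlterIsr() {
        state = new IsrState(state.isr, state.zkVersion, true);
        return state;
    }

    // A LeaderAndIsr update from the controller replaces the state outright,
    // which implicitly cancels any pending AlterIsr request.
    synchronized void onLeaderAndIsr(List<Integer> isr, int zkVersion) {
        state = new IsrState(isr, zkVersion, false);
    }

    // Asynchronous AlterIsr callback: apply the response only if the state
    // we sent from is still the current one and the response is newer.
    synchronized boolean onAlterIsrResponse(IsrState sentFrom,
                                            List<Integer> isr,
                                            int zkVersion) {
        if (state != sentFrom || zkVersion <= state.zkVersion) {
            return false; // stale response, ignore it
        }
        state = new IsrState(isr, zkVersion, false);
        return true;
    }
}
```

Because the ISR and its version are swapped together as one immutable object, a response that arrives after a LeaderAndIsr update no longer matches the current state and is dropped rather than applied on top of newer state.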
Issue Links
- is related to KAFKA-12686 Race condition in AlterIsr response handling (Resolved)