Details
Type: Task
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Motivation
Currently, our rebalance fail-over is a trivial infinite loop of retries:
- on any issue in the catch-up phase or later, we call the onReconfigurationError listener
- for now, this listener just counts the retries and calls the changePeersAndLearnersAsync logic again and again
At the same time, the resetPartitions logic can be called at potentially any moment and rewrite the pending assignments. So, we can have a race between the rebalance retries and resetPartitions.
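One interleaving worth checking during the investigation can be sketched as follows. This is a minimal, hypothetical model and not the actual Ignite code: the `Pending` class, the revision counter, and the methods `naiveRetry`, `guardedRetry` and `runInterleaving` are all assumptions introduced for illustration, and only the name resetPartitions comes from the description above. It shows how a retry that blindly re-applies the peers it captured earlier could overwrite a reset that happened in between, and how a compare-and-set style check would detect that case.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class RebalanceRetryRaceSketch {
    /** Hypothetical stand-in for pending assignments: a revision plus the target peers. */
    static final class Pending {
        final long revision;
        final List<String> peers;

        Pending(long revision, List<String> peers) {
            this.revision = revision;
            this.peers = peers;
        }
    }

    /** Models resetPartitions overwriting the pending assignments (assumed behavior). */
    public static void resetPartitions(AtomicReference<Pending> pending, List<String> peers) {
        Pending current = pending.get();
        pending.set(new Pending(current.revision + 1, peers));
    }

    /** Naive retry: blindly re-applies the peers captured when the failure was reported. */
    public static void naiveRetry(AtomicReference<Pending> pending, Pending captured) {
        Pending current = pending.get();
        pending.set(new Pending(current.revision + 1, captured.peers));
    }

    /** Guarded retry: re-applies only if nothing replaced the captured assignments meanwhile. */
    public static boolean guardedRetry(AtomicReference<Pending> pending, Pending captured) {
        // compareAndSet compares references, so any rewrite by a reset makes it fail.
        return pending.compareAndSet(captured, new Pending(captured.revision + 1, captured.peers));
    }

    /** Runs the interleaving "failure reported, reset, retry" and returns the final peers. */
    public static List<String> runInterleaving(boolean guarded) {
        AtomicReference<Pending> pending =
                new AtomicReference<>(new Pending(1, Arrays.asList("A", "B", "C")));

        // The retry listener captures the assignments it intends to re-apply.
        Pending captured = pending.get();
        // A reset rewrites the pending assignments before the retry runs.
        resetPartitions(pending, Arrays.asList("A"));

        if (guarded) {
            guardedRetry(pending, captured); // detects the change and leaves the reset in place
        } else {
            naiveRetry(pending, captured);   // overwrites the reset with the stale peers
        }
        return pending.get().peers;
    }

    public static void main(String[] args) {
        System.out.println("naive:   " + runInterleaving(false));
        System.out.println("guarded: " + runInterleaving(true));
    }
}
```

In this model the naive retry ends with the stale peers `[A, B, C]`, while the guarded retry leaves the reset result `[A]` untouched. Whether the real retry path can actually hit such an interleaving is exactly what this ticket is meant to find out.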
Definition of done
Under this ticket we need to investigate all possible issues caused by this race, if any, and create appropriate tickets to resolve them.