Ignite / IGNITE-23566

Investigate possible races between resetPartitions and infinite rebalance retries


Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      Motivation
      Currently, our rebalance failover is a trivial infinite loop of retries:

      • on any failure during the catch-up phase or later, we call the onReconfigurationError listener
      • currently, this listener just counts the retries and calls changePeersAndLearnersAsync again and again

      At the same time, resetPartitions can be called at any moment and rewrite the pending assignments. As a result, there can be a race between the rebalance retries and resetPartitions.
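
      One interleaving worth checking can be replayed deterministically, as a rough sketch. Everything here is hypothetical and only borrows names from the description: pendingAssignments is an in-memory stand-in for the stored pending assignments, changePeersAndLearnersAsync only records the requested peer set, and the listener is assumed to re-issue the same target it failed with. Whether the real listener behaves this way is exactly what the investigation has to establish.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Single-threaded replay of one possible interleaving. All members are illustrative stand-ins. */
public class RebalanceRaceSketch {
    /** Hypothetical stand-in for the stored pending assignments. */
    public static final AtomicReference<Set<String>> pendingAssignments = new AtomicReference<>();

    /** Every peer set passed to the simulated changePeersAndLearnersAsync, in call order. */
    public static final List<Set<String>> requestedConfigs = new ArrayList<>();

    /** Retry counter kept by the simulated listener. */
    public static int retries;

    /** Simulated changePeersAndLearnersAsync: only records the requested peer set. */
    public static void changePeersAndLearnersAsync(Set<String> peers) {
        requestedConfigs.add(peers);
    }

    /** Simulated onReconfigurationError: counts the retry and re-issues the failed target (assumption). */
    public static void onReconfigurationError(Set<String> failedTarget) {
        retries++;
        changePeersAndLearnersAsync(failedTarget);
    }

    /** Simulated resetPartitions: overwrites the pending assignments. */
    public static void resetPartitions(Set<String> newPending) {
        pendingAssignments.set(newPending);
    }

    /** Replays the interleaving; returns true if the last request no longer matches the pending assignments. */
    public static boolean replayInterleaving() {
        // Reset the simulated state so the replay can be run repeatedly.
        pendingAssignments.set(Set.of("A", "B", "C"));
        requestedConfigs.clear();
        retries = 0;

        Set<String> target = pendingAssignments.get(); // rebalance starts towards {A, B, C}
        changePeersAndLearnersAsync(target);
        resetPartitions(Set.of("A"));                  // pending assignments are rewritten meanwhile
        onReconfigurationError(target);                // catch-up fails, the retry reuses the old target

        Set<String> last = requestedConfigs.get(requestedConfigs.size() - 1);
        return !last.equals(pendingAssignments.get());
    }

    public static void main(String[] args) {
        System.out.println("stale retry observed: " + replayInterleaving());
    }
}
```

      In this model the retry asks for a configuration that resetPartitions has already replaced. If the real retry path re-reads the pending assignments before each attempt, this particular interleaving would not apply.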

      Definition of done
      Under this ticket we need to investigate all possible problems, if any, and create follow-up tickets to resolve them.

          People

            Assignee: Kirill Gusakov (kgusakov)
            Reporter: Kirill Gusakov (kgusakov)