Ignite / IGNITE-16011

Start a new rebalance round when partition assignments are updated

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0-alpha5
    • Component/s: None

    Description

      When partition assignments are updated, we need to start raft changePeers and handle failover scenarios.

      When a metastore event about a partition assignments update is received, we need to:

      • Start all needed nodes (partition.assignments.pending / partition.assignments.stable).
      • After the nodes start successfully, check whether the current node is the leader of the raft group (the leader response must be refreshed for the current term) and, if it is, call changePeers(leaderTerm, peers). changePeers requests from old terms must be skipped.
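
The term check above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Ignite or JRaft API: `ChangePeersGuardSketch`, `InMemoryGroup` and `onAssignmentsUpdated` are hypothetical names, and the raft group is replaced by an in-memory stand-in that only tracks a leader id, a term and a peer list.

```java
import java.util.List;

/** Hypothetical sketch: none of these types are Ignite or JRaft APIs. */
final class ChangePeersGuardSketch {
    /** In-memory stand-in for a raft group: leader id, current term, peers. */
    static final class InMemoryGroup {
        private final String leaderId;
        private final long currentTerm;
        private List<String> peers;

        InMemoryGroup(String leaderId, long currentTerm, List<String> peers) {
            this.leaderId = leaderId;
            this.currentTerm = currentTerm;
            this.peers = List.copyOf(peers);
        }

        String leaderId() { return leaderId; }
        long currentTerm() { return currentTerm; }
        List<String> peers() { return peers; }

        /** Applies the new peers only if the request carries the current term. */
        boolean changePeers(long leaderTerm, List<String> newPeers) {
            if (leaderTerm != currentTerm) {
                return false; // request from an old term: skipped
            }
            peers = List.copyOf(newPeers);
            return true;
        }
    }

    /**
     * Called after the needed nodes have started (assumption: the caller
     * has already done that). Returns true if changePeers was applied.
     */
    static boolean onAssignmentsUpdated(String localNodeId, InMemoryGroup group, List<String> pendingPeers) {
        // Stands in for refreshing the leader response for the current term.
        long leaderTerm = group.currentTerm();
        if (!localNodeId.equals(group.leaderId())) {
            return false; // only the leader starts changePeers
        }
        return group.changePeers(leaderTerm, pendingPeers);
    }
}
```

In this sketch a non-leader node simply does nothing, and a changePeers call carrying a stale term is rejected by the group itself; where the stale-term check lives in the real code is not stated in this ticket.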

      Also, we need the propagation of some new events from the raft side:

      • onLeaderElected(boolean configurationChangeInProgress): must be executed on the new leader when the raft group changes its leader. We may also need to check whether a new lease has been received; this needs investigation.
      • onChangePeersError(errorContext): must be executed when any error occurs during changePeers.
      • onChangePeersCommitted(peers): must be executed with the list of new peers when changePeers has completed successfully.

      and handle them appropriately:

      • onLeaderElected(configurationChangeInProgress) - we need to:
        • if configurationChangeInProgress == false and pending/planned assignments are not empty, run a new changePeers. If true, do nothing.
      • onChangePeersError(errorContext) - run failover logic
      • onChangePeersCommitted(peers) - check whether planned assignments are not empty and, if so, move them to pending.
        • Update pending and stable partition assignments:
          metastoreInvoke: // atomic
              // Here we can check the invariant: pending being empty while planned is not empty is impossible
              partition.assignments.stable = appliedPeers
              if empty(partition.assignments.planned):
                  partition.assignments.pending = empty
              else:
                  partition.assignments.pending = partition.assignments.planned
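
The three handlers above can be sketched in Java as follows. This is only an illustration under stated assumptions: `RebalanceHandlersSketch` and `Assignments` are hypothetical names, the metastore is replaced by an in-memory object, and a `synchronized` block stands in for the atomic metastoreInvoke. The sketch starts changePeers with the pending assignments (an assumption, since the ticket says only "pending/planned"), and it leaves planned untouched after the commit because the pseudocode does not say whether it is cleared.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: not the Ignite implementation. */
final class RebalanceHandlersSketch {
    /** In-memory stand-in for the three metastore keys of one partition. */
    static final class Assignments {
        List<String> stable = List.of();
        List<String> pending = List.of();
        List<String> planned = List.of();
    }

    private final Assignments assignments;

    /** Records every changePeers that the handlers would start. */
    final List<List<String>> startedChangePeers = new ArrayList<>();

    /** Counts failover runs; the failover logic itself is not specified in the ticket. */
    int failovers;

    RebalanceHandlersSketch(Assignments assignments) {
        this.assignments = assignments;
    }

    void onLeaderElected(boolean configurationChangeInProgress) {
        if (configurationChangeInProgress) {
            return; // a configuration change is already running: do nothing
        }
        synchronized (assignments) {
            // Assumption: the new changePeers targets the pending assignments.
            if (!assignments.pending.isEmpty()) {
                startedChangePeers.add(List.copyOf(assignments.pending));
            }
        }
    }

    void onChangePeersError(String errorContext) {
        failovers++; // placeholder for the failover logic
    }

    void onChangePeersCommitted(List<String> appliedPeers) {
        synchronized (assignments) { // stands in for the atomic metastoreInvoke
            assignments.stable = List.copyOf(appliedPeers);
            assignments.pending = assignments.planned.isEmpty()
                    ? List.of()
                    : List.copyOf(assignments.planned);
        }
    }
}
```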

      When partition.assignments.stable is updated, we need to:

      • Replace the current raft client with a new one that uses the appropriate peers
      • Stop unneeded raft nodes
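
A minimal sketch of the reaction to a stable assignments update is shown below. `StableUpdateSketch` and `RaftClient` are hypothetical names, not Ignite classes. The rule used to decide that the local raft node is unneeded (the local node appears in neither stable nor pending) is an assumption made for this illustration; the ticket does not define it.

```java
import java.util.List;

/** Hypothetical sketch: not the Ignite implementation. */
final class StableUpdateSketch {
    /** Stand-in for a raft client bound to a fixed list of peers. */
    static final class RaftClient {
        private final List<String> peers;

        RaftClient(List<String> peers) {
            this.peers = List.copyOf(peers);
        }

        List<String> peers() { return peers; }
    }

    private final String localNodeId;
    private RaftClient client;
    private boolean localRaftNodeRunning = true;

    StableUpdateSketch(String localNodeId, List<String> initialPeers) {
        this.localNodeId = localNodeId;
        this.client = new RaftClient(initialPeers);
    }

    void onStableUpdated(List<String> stable, List<String> pending) {
        // Replace the client instead of mutating it, so callers holding
        // the old client keep a consistent peer list.
        client = new RaftClient(stable);

        // Assumption: the local raft node is unneeded once this node is
        // in neither the stable nor the pending assignments.
        if (!stable.contains(localNodeId) && !pending.contains(localNodeId)) {
            localRaftNodeRunning = false;
        }
    }

    RaftClient client() { return client; }
    boolean localRaftNodeRunning() { return localRaftNodeRunning; }
}
```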
         

      (Phase 1)

            People

              kgusakov Kirill Gusakov
              kgusakov Kirill Gusakov
              Alexander Lapin Alexander Lapin

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 8h