  1. Ratis
  2. RATIS-1247 Support rolling upgrade and rollback
  3. RATIS-1770

Yield leader to higher priority peer by TransferLeadership

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0, 2.5.0
    • Component/s: server
    • Labels: None

    Description

      Follow-up to RATIS-1762.

      There might be race conditions between priority-based YieldLeadership and user-requested TransferLeadership. For example:

      Node    Role      Priority
      Peer 1  Leader    0
      Peer 2  Follower  1
      Peer 3  Follower  1

      Suppose the user requests TransferLeadership to peer 3 while YieldLeadership finds that peer 2 has a higher priority than the current leader.
      Peer 1 will then send StartLeaderElection to both peer 2 and peer 3, so the two elections may race (although the race is benign).

      One immediate thought is to use the new TransferLeadership to yield leadership to the higher-priority peer.
      But that may cause the following problem, as quoted:

      If the higher-priority peer lags far behind, it may take some time to catch up to the latest transaction. If the prior leader rejects client requests in the meantime, the service may be unavailable for a long time.

      To solve this problem, the old leader should start TransferLeadership only if the higher-priority peer is up to date.
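
      The rule above can be sketched as a small selection function. This is only an illustration under stated assumptions, not the code from the attached patches: the class YieldLeadershipSketch, the Follower type, its matchIndex field, and chooseYieldTarget are all hypothetical names, and "up to date" is assumed here to mean that the follower has replicated the leader's last log index.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class YieldLeadershipSketch {
  /** Hypothetical leader-side view of a follower; not a Ratis class. */
  public static final class Follower {
    public final String id;
    public final int priority;
    public final long matchIndex; // highest log index known to be replicated on this follower

    public Follower(String id, int priority, long matchIndex) {
      this.id = id;
      this.priority = priority;
      this.matchIndex = matchIndex;
    }
  }

  /**
   * Returns a follower the leader may yield to, or empty if it should keep serving.
   * A follower qualifies only if its priority is strictly higher than the leader's
   * and it has already caught up to the leader's last log index (assumed definition
   * of "up to date").
   */
  public static Optional<Follower> chooseYieldTarget(
      int leaderPriority, long leaderLastIndex, List<Follower> followers) {
    return followers.stream()
        .filter(f -> f.priority > leaderPriority)      // higher priority than the leader
        .filter(f -> f.matchIndex >= leaderLastIndex)  // no catch-up needed before transfer
        .max(Comparator.comparingInt((Follower f) -> f.priority));
  }

  public static void main(String[] args) {
    // Peers 2 and 3 both outrank the leader (priority 0), but peer 2 still lags.
    List<Follower> followers = List.of(
        new Follower("peer2", 1, 90L),
        new Follower("peer3", 1, 100L));
    String target = chooseYieldTarget(0, 100L, followers).map(f -> f.id).orElse("none");
    System.out.println("yield target: " + target);
  }
}
```

      With this check, a lagging higher-priority peer never triggers a transfer, so the leader keeps accepting client requests until some higher-priority peer has caught up.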

      Attachments

        1. 845_review.patch
          18 kB
          Tsz-wo Sze
        2. 845_review2.patch
          22 kB
          Tsz-wo Sze
        3. 845_review3.patch
          21 kB
          Tsz-wo Sze

          People

            Assignee: Kaijie Chen (ckj)
            Reporter: Kaijie Chen (ckj)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 5h