ZooKeeper / ZOOKEEPER-3707

Leadership Election gets stuck in 5 node ensemble

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.5.5
    • Fix Version/s: None
    • Component/s: leaderElection
    • Labels: None

    Description

      Scenario:
      1. A 5-node ensemble (SIDs 1, 2, 3, 4, 5). SID 5 is the current leader.
      2. The test brings down SID 5's ZK process.
      3. Leader election begins. As expected, each SID first votes for itself as leader.
      4. SID 1 and SID 2 receive the notification from SID 3 before they receive the notification from SID 4. As expected, they update their votes to propose SID 3 as the leader and send notifications.
      5. SID 3 receives the notifications from SID 1 and SID 2 plus its own vote, so its election termination predicate is satisfied. It moves to the LEADING state, exits FLE, and proceeds to the next phase.
      6. Meanwhile, SID 2 moves to the FOLLOWING state, exits FLE, and proceeds to the next phase (NEWLEADER sending, etc.).

      So far so good.
      7. Meanwhile (sometime after step 4), SID 1 receives the notification from SID 4. Since SID 4 > SID 3 (and the zxid is the same), SID 1 changes its mind, updates its proposal to elect SID 4 as leader, and sends notifications.
      8. SID 4 is trying to elect itself as leader. Even though SID 2 and SID 3 are out of the election, SID 4 cannot leave the election because not enough nodes are following SID 3 (only one node is following SID 3).
      9. SID 1 is also stuck in FLE, like SID 4.

      So, in summary, SID 1 and SID 4 are stuck in FLE (in lookForLeader()), and SID 2 and SID 3 are stuck in the next phase because a quorum has not responded to SID 3's NEWLEADER.
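
      The scenario above can be replayed with a minimal sketch of the two checks involved: the vote ordering (epoch, then zxid, then SID) and the majority test. This is not the ZooKeeper implementation. The class FleSketch, the helpers supersedes and hasQuorum, and the zxid and epoch values are assumptions made for illustration, loosely modeled on how FLE is described in this report.

```java
import java.util.Map;

// Hypothetical, simplified model of the checks described in the report.
// It is NOT ZooKeeper's FastLeaderElection code.
class FleSketch {

    // Returns true if the new proposal should replace the current one:
    // a higher epoch wins, then a higher zxid, then a higher SID.
    static boolean supersedes(long newSid, long newZxid, long newEpoch,
                              long curSid, long curZxid, long curEpoch) {
        if (newEpoch != curEpoch) {
            return newEpoch > curEpoch;
        }
        if (newZxid != curZxid) {
            return newZxid > curZxid;
        }
        return newSid > curSid;
    }

    // Returns true if strictly more than half of the ensemble backs 'leader'.
    // votesBySid maps a voter's SID to the SID it currently proposes.
    static boolean hasQuorum(Map<Long, Long> votesBySid, long leader, int ensembleSize) {
        long supporters = votesBySid.values().stream()
                .filter(proposed -> proposed == leader)
                .count();
        return supporters > ensembleSize / 2;
    }

    public static void main(String[] args) {
        // Illustrative values only: the report says the zxids are equal,
        // but does not give the actual zxid or epoch.
        final long zxid = 0x100L;
        final long epoch = 1L;
        final int ensembleSize = 5;

        // Step 7: SID 1 holds a vote for 3 and then sees a vote for 4.
        System.out.println("SID 1 switches from 3 to 4: "
                + supersedes(4, zxid, epoch, 3, zxid, epoch));

        // Step 5: the votes SID 3 has seen (from 1, 2, and itself).
        Map<Long, Long> seenBySid3 = Map.of(1L, 3L, 2L, 3L, 3L, 3L);
        System.out.println("Quorum for 3 at step 5: "
                + hasQuorum(seenBySid3, 3, ensembleSize));

        // Steps 8 and 9: after SID 1 switches, the live nodes are split 2/2.
        Map<Long, Long> afterSwitch = Map.of(1L, 4L, 2L, 3L, 3L, 3L, 4L, 4L);
        System.out.println("Quorum for 3 afterwards: "
                + hasQuorum(afterSwitch, 3, ensembleSize));
        System.out.println("Quorum for 4 afterwards: "
                + hasQuorum(afterSwitch, 4, ensembleSize));
    }
}
```

      Under these assumptions, SID 3 sees three of five votes at step 5, but once SID 1 switches to SID 4 each candidate is backed by only two of the four live nodes, so neither side reaches the three votes a 5-node ensemble requires.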

          People

            Assignee: Unassigned
            Reporter: Suhas Dantkale (suhas.dantkale)
            Votes: 0
            Watchers: 4
