Uploaded image for project: 'ZooKeeper'
  1. ZooKeeper
  2. ZOOKEEPER-1912

Leader election lets 2 leaders happen

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Not A Problem
    • 3.4.6
    • None
    • leaderElection
    • None
    • Ubuntu 12.04, OpenJDK 1.6

    Description

      In 3-node cluster, when there are 2 nodes die and reboot during leader election, it might lead to the case that there are 2 leaders happen in the system. Eventually, a leader that does not has follower supports and quit being leader, but it makes us lose some availability.

      I am building a tools that can reorder messages and disk write, and also inject node crash to the system and found this bug.
      These are the step of events that my tools execute in sequence that lead to 2 leaders at the end.
      My zookeeper nodes have id = 0,1,2

      packetsend from=0 to=1 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=0 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=1 to=0 state=0 leader=1 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=1 to=2 state=0 leader=1 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=1 to=0 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=0 to=1 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=1 to=2 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=0 to=2 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
      diskwrite nodeId=0 write=currentEpoch
      nodecrash id=0
      nodecrash id=1
      nodestart id=0
      nodestart id=1
      diskwrite nodeId=2 write=currentEpoch
      packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=0 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=0 to=1 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=1 to=0 state=0 leader=1 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=1 to=2 state=0 leader=1 zxid=0 electionEpoch=1 peerEpoch=0
      packetsend from=2 to=0 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=2 to=1 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=1 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=2 to=1 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=1 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=2 to=1 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=1 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=2 to=1 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=1 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
      packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
      packetsend from=0 to=2 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
      packetsend from=1 to=0 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
      packetsend from=1 to=2 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
      packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
      packetsend from=0 to=2 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
      packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
      packetsend from=0 to=2 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
      diskwrite nodeId=2 write=currentEpoch
      diskwrite nodeId=1 write=currentEpoch

      Attachments

        1. conf.zip
          0.9 kB
          Tanakorn Leesatapornwongsa
        2. log.zip
          15 kB
          Tanakorn Leesatapornwongsa

        Activity

          People

            fpj Flavio Paiva Junqueira
            tanakorn Tanakorn Leesatapornwongsa
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: