ZooKeeper / ZOOKEEPER-4051

Leader did not lose the quorum when a node left the quorum of 3 out of 5 nodes.


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.5.7
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      This ZooKeeper ensemble has 5 nodes: nodes 1, 2, 3, 4 and 5. At the time, the leader was node 4. The quorum consisted of nodes 3, 4 and 5. Nodes 1 and 2 kept disconnecting from node 4, so they never joined the quorum.
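
      For reference, the majority rule this report relies on can be sketched as follows. This is a simplified stand-in written for illustration, not ZooKeeper's actual quorum verifier; the class name MajorityQuorumSketch and the method containsQuorum are hypothetical. It assumes all 5 nodes are voting members and that a quorum is a strict majority of them.

```java
import java.util.Set;

// Illustrative sketch only: a simple majority check, not ZooKeeper's own code.
final class MajorityQuorumSketch {

    // Returns true when the synced servers form a strict majority
    // of the voting members (for example, at least 3 out of 5).
    static boolean containsQuorum(int votingMembers, Set<Long> syncedSids) {
        return syncedSids.size() > votingMembers / 2;
    }

    public static void main(String[] args) {
        int ensembleSize = 5; // assumption: all 5 nodes are voting members

        // Nodes 3, 4 and 5 synced: a majority of 5.
        System.out.println(containsQuorum(ensembleSize, Set.of(3L, 4L, 5L)));

        // Only nodes 4 and 5 synced: not a majority of 5.
        System.out.println(containsQuorum(ensembleSize, Set.of(4L, 5L)));
    }
}
```

      Under this rule, a leader that is synced only with sids 4 and 5 no longer has a quorum, which is the state the leader reports when it finally shuts down.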

       

      At 2020-12-08 14:10:23, node 4 lost its quorum. But this only occurred after node 3 had disconnected from node 4 multiple times. The disconnection messages for node 3 had been appearing for more than 12 hours before this (there are no logs from earlier than that), yet the quorum was not lost during that time. The following log is from node 4. It shows that when the quorum was lost, only 2 nodes were left in the quorum: node 4 and node 5.

       

      Note that all IP addresses are replaced with 0.0.0.0 to allow it to be included in this bug report.

       

      [2020-12-08 14:10:20,702] INFO Notification: 2 (message format version), 2 (n.leader), 0x503300000000 (n.zxid), 0x3c (n.round), LOOKING (n.state), 2 (n.sid), 0x5033 (n.peerEPoch), LEADING (my state)0 (n.config version) (org.apache.zookeeper.server.quorum.FastLeaderElection)
      [2020-12-08 14:10:20,918] INFO Notification: 2 (message format version), 2 (n.leader), 0x503300000000 (n.zxid), 0x3c (n.round), LOOKING (n.state), 2 (n.sid), 0x5033 (n.peerEPoch), LEADING (my state)0 (n.config version) (org.apache.zookeeper.server.quorum.FastLeaderElection)
      [2020-12-08 14:10:21,045] INFO Received connection request 0.0.0.0:41966 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
      [2020-12-08 14:10:21,056] INFO Notification: 2 (message format version), 2 (n.leader), 0x503300000000 (n.zxid), 0x3d (n.round), LOOKING (n.state), 2 (n.sid), 0x5033 (n.peerEPoch), LEADING (my state)0 (n.config version) (org.apache.zookeeper.server.quorum.FastLeaderElection)
      [2020-12-08 14:10:21,193] INFO Notification: 2 (message format version), 1 (n.leader), 0x420400000004 (n.zxid), 0x3d (n.round), LOOKING (n.state), 1 (n.sid), 0x5033 (n.peerEPoch), LEADING (my state)0 (n.config version) (org.apache.zookeeper.server.quorum.FastLeaderElection)
      [2020-12-08 14:10:21,193] INFO Notification: 2 (message format version), 1 (n.leader), 0x420400000004 (n.zxid), 0x3d (n.round), LOOKING (n.state), 1 (n.sid), 0x5033 (n.peerEPoch), LEADING (my state)0 (n.config version) (org.apache.zookeeper.server.quorum.FastLeaderElection)
      [2020-12-08 14:10:21,204] INFO Notification: 2 (message format version), 2 (n.leader), 0x503300000000 (n.zxid), 0x3d (n.round), LOOKING (n.state), 1 (n.sid), 0x5033 (n.peerEPoch), LEADING (my state)0 (n.config version) (org.apache.zookeeper.server.quorum.FastLeaderElection)
      [2020-12-08 14:10:22,180] INFO Received connection request 0.0.0.0:41970 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
      [2020-12-08 14:10:22,181] WARN Connection broken for id 3, my id = 4, error =  (org.apache.zookeeper.server.quorum.QuorumCnxManager)
      java.net.SocketException: Socket closed
              at java.net.SocketInputStream.socketRead0(Native Method)
              at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
              at java.net.SocketInputStream.read(SocketInputStream.java:171)
              at java.net.SocketInputStream.read(SocketInputStream.java:141)
              at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
              at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
              at java.io.DataInputStream.readInt(DataInputStream.java:387)
              at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1212)
      [2020-12-08 14:10:22,181] WARN Interrupting SendWorker (org.apache.zookeeper.server.quorum.QuorumCnxManager)
      [2020-12-08 14:10:22,182] ERROR Failed to send last message. Shutting down thread. (org.apache.zookeeper.server.quorum.QuorumCnxManager)
      java.net.SocketException: Socket closed
              at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118)
              at java.net.SocketOutputStream.write(SocketOutputStream.java:134)
              at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
              at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:1088)
              at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1115)
      [2020-12-08 14:10:22,182] WARN Send worker leaving thread  id 3 my id = 4 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
      [2020-12-08 14:10:22,402] INFO Received connection request 0.0.0.0:41974 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
      [2020-12-08 14:10:22,403] ERROR Failed to send last message. Shutting down thread. (org.apache.zookeeper.server.quorum.QuorumCnxManager)
      java.net.SocketException: Socket closed
              at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118)
              at java.net.SocketOutputStream.write(SocketOutputStream.java:134)
              at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
              at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:1088)
              at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1115)
      [2020-12-08 14:10:22,403] WARN Send worker leaving thread  id 3 my id = 4 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
      [2020-12-08 14:10:22,404] WARN Interrupting SendWorker (org.apache.zookeeper.server.quorum.QuorumCnxManager)
      [2020-12-08 14:10:22,829] INFO Received connection request 0.0.0.0:51666 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
      [2020-12-08 14:10:22,830] WARN Send worker leaving thread  id 3 my id = 4 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
      [2020-12-08 14:10:22,830] INFO Notification: 2 (message format version), 2 (n.leader), 0x503300000000 (n.zxid), 0x3d (n.round), LOOKING (n.state), 3 (n.sid), 0x5033 (n.peerEPoch), LEADING (my state)0 (n.config version) (org.apache.zookeeper.server.quorum.FastLeaderElection)
      [2020-12-08 14:10:22,975] INFO Notification: 2 (message format version), 2 (n.leader), 0x503300000000 (n.zxid), 0x3d (n.round), LOOKING (n.state), 3 (n.sid), 0x5033 (n.peerEPoch), LEADING (my state)0 (n.config version) (org.apache.zookeeper.server.quorum.FastLeaderElection)
      [2020-12-08 14:10:23,443] INFO Shutting down (org.apache.zookeeper.server.quorum.Leader)
      [2020-12-08 14:10:23,443] INFO Shutdown called (org.apache.zookeeper.server.quorum.Leader)
       
      java.lang.Exception: shutdown Leader! reason: Not sufficient followers synced, only synced with sids: [ [4, 5],[4, 5] ]
              at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:682)
              at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:662)
              at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1266)
      [2020-12-08 14:10:23,444] INFO exception while shutting down acceptor: java.net.SocketException: Socket closed (org.apache.zookeeper.server.quorum.Leader)
      [2020-12-08 14:10:23,448] INFO shutting down (org.apache.zookeeper.server.ZooKeeperServer)
      [2020-12-08 14:10:23,448] INFO Shutting down (org.apache.zookeeper.server.SessionTrackerImpl)
      [2020-12-08 14:10:23,448] INFO Shutting down (org.apache.zookeeper.server.quorum.LeaderRequestProcessor)
      [2020-12-08 14:10:23,449] INFO Shutting down (org.apache.zookeeper.server.PrepRequestProcessor)
      [2020-12-08 14:10:23,449] INFO Shutting down (org.apache.zookeeper.server.quorum.ProposalRequestProcessor)
      [2020-12-08 14:10:23,449] INFO Shutting down (org.apache.zookeeper.server.quorum.CommitProcessor)
      [2020-12-08 14:10:23,449] INFO CommitProcessor exited loop! (org.apache.zookeeper.server.quorum.CommitProcessor)
      [2020-12-08 14:10:23,449] INFO PrepRequestProcessor exited loop! (org.apache.zookeeper.server.PrepRequestProcessor)
      [2020-12-08 14:10:23,455] INFO Shutting down (org.apache.zookeeper.server.quorum.Leader)
      [2020-12-08 14:10:23,455] INFO shutdown of request processor complete (org.apache.zookeeper.server.FinalRequestProcessor)
      [2020-12-08 14:10:23,455] INFO Shutting down (org.apache.zookeeper.server.SyncRequestProcessor)
      [2020-12-08 14:10:23,455] INFO SyncRequestProcessor exited! (org.apache.zookeeper.server.SyncRequestProcessor)
      [2020-12-08 14:10:23,530] WARN PeerState set to LOOKING (org.apache.zookeeper.server.quorum.QuorumPeer)
      [2020-12-08 14:10:23,530] INFO LOOKING (org.apache.zookeeper.server.quorum.QuorumPeer)
      [2020-12-08 14:10:23,531] WARN ******* GOODBYE /0.0.0.0:46138 ******** (org.apache.zookeeper.server.quorum.LearnerHandler)
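
      To make it easier to see which peers were still in leader election while node 4 was leading, the Notification lines above can be summarized with a small helper. This is only a sketch: the class NotificationLogSketch and the method lastStateBySid are hypothetical names, and the regular expression assumes the exact "(n.state), ... (n.sid)" layout seen in this log.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: summarizes FastLeaderElection notification lines.
final class NotificationLogSketch {

    // Captures the sender's state and sid, e.g. "LOOKING (n.state), 2 (n.sid)".
    private static final Pattern NOTIFICATION =
            Pattern.compile("(\\w+) \\(n\\.state\\), (\\d+) \\(n\\.sid\\)");

    // Maps each sender sid to the last state it reported, in order of first appearance.
    // Lines that are not notifications are skipped.
    static Map<Long, String> lastStateBySid(List<String> lines) {
        Map<Long, String> states = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = NOTIFICATION.matcher(line);
            if (m.find()) {
                states.put(Long.parseLong(m.group(2)), m.group(1));
            }
        }
        return states;
    }
}
```

      Applied to the excerpt above, the helper would report sids 1, 2 and 3 as LOOKING, since every notification in the excerpt comes from one of those three peers in that state.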
      
      
      

       


            People

            • Assignee: Unassigned
            • Reporter: Badai Aqrandista (badai)