  1. ZooKeeper
  2. ZOOKEEPER-3828

ZooKeeper clients get connection timeouts when the leader node is restarted

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.5.5, 3.5.7, 3.6.1, 3.5.8
    • Fix Version/s: None
    • Component/s: java client
    • Labels: None

    Description

      I have configured a 5-node ZooKeeper cluster running version 3.6.1 in a Docker containerized environment. As part of some destructive testing, I restarted the ZooKeeper leader. A re-election took place, and all 5 nodes (containers) came back in a good state with a new leader. However, when I log in to one of the containers, start the ZooKeeper CLI (./zkCli.sh), and run the command ls /, I see the error below:
       
      [zk: localhost:2181(CONNECTING) 1] 

      [zk: localhost:2181(CONNECTING) 1] ls /

      2020-05-14 23:48:26,556 [myid:localhost:2181] - WARN  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1229] - Client session timed out, have not heard from server in 30001ms for session id 0x0

      2020-05-14 23:48:26,556 [myid:localhost:2181] - WARN  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1272] - Session 0x0 for sever localhost/127.0.0.1:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.

      org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client session timed out, have not heard from server in 30001ms for session id 0x0

      at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1230)

      KeeperErrorCode = ConnectionLoss for /

      [zk: localhost:2181(CONNECTING) 2] 2020-05-14 23:48:28,089 [myid:localhost:2181] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1154] - Opening socket connection to server localhost/127.0.0.1:2181.

      2020-05-14 23:48:28,089 [myid:localhost:2181] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1156] - SASL config status: Will not attempt to authenticate using SASL (unknown error)

      2020-05-14 23:48:28,090 [myid:localhost:2181] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@986] - Socket connection established, initiating session, client: /127.0.0.1:60384, server: localhost/127.0.0.1:2181

      2020-05-14 23:48:58,119 [myid:localhost:2181] - WARN  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1229] - Client session timed out, have not heard from server in 30030ms for session id 0x0

      2020-05-14 23:48:58,120 [myid:localhost:2181] - WARN  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1272] - Session 0x0 for sever localhost/127.0.0.1:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.

      org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client session timed out, have not heard from server in 30030ms for session id 0x0

      at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1230)

      2020-05-14 23:49:00,003 [myid:localhost:2181] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1154] - Opening socket connection to server localhost/127.0.0.1:2181.

      2020-05-14 23:49:00,004 [myid:localhost:2181] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1156] - SASL config status: Will not attempt to authenticate using SASL (unknown error)

      2020-05-14 23:49:00,004 [myid:localhost:2181] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@986] - Socket connection established, initiating session, client: /127.0.0.1:32936, server: localhost/127.0.0.1:2181

      2020-05-14 23:49:30,032 [myid:localhost:2181] - WARN  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1229] - Client session timed out, have not heard from server in 30029ms for session id 0x0

      2020-05-14 23:49:30,033 [myid:localhost:2181] - WARN  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1272] - Session 0x0 for sever localhost/127.0.0.1:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.

      org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client session timed out, have not heard from server in 30029ms for session id 0x0

      at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1230)

      2020-05-14 23:49:31,230 [myid:localhost:2181] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1154] - Opening socket connection to server localhost/127.0.0.1:2181.

      2020-05-14 23:49:31,230 [myid:localhost:2181] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1156] - SASL config status: Will not attempt to authenticate using SASL (unknown error)

      2020-05-14 23:49:31,230 [myid:localhost:2181] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@986] - Socket connection established, initiating session, client: /127.0.0.1:33766, server: localhost/127.0.0.1:2181

      Does anyone know what could possibly be wrong? For reference: https://issues.apache.org/jira/browse/ZOOKEEPER-2164

      This behavior is observed on all the nodes when the leader is restarted. All is good when a follower is restarted.
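
      To narrow down which nodes are actually serving requests after the leader restart, each node can be asked for its role with the srvr four-letter command. The following is only a minimal sketch, not part of the original report: the class name ZkModeCheck, the default host "localhost", the 5-second timeout, and client port 2181 on every node are assumptions, and it assumes srvr is allowed by the server's four-letter-word whitelist.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class ZkModeCheck {

    /** Returns the value of the "Mode:" line in an srvr reply, or "unknown" if absent. */
    public static String parseMode(String srvrResponse) {
        for (String line : srvrResponse.split("\\R")) {
            if (line.startsWith("Mode:")) {
                return line.substring("Mode:".length()).trim();
            }
        }
        return "unknown";
    }

    /** Sends one four-letter command to a server and returns the raw reply text. */
    public static String fourLetterWord(String host, int port, String cmd, int timeoutMs)
            throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);

            OutputStream out = socket.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // The server closes the connection after replying, so read until EOF.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return buf.toString("UTF-8");
        }
    }

    public static void main(String[] args) {
        // Hypothetical default; pass the container host names as arguments instead.
        String[] hosts = args.length > 0 ? args : new String[] {"localhost"};
        for (String host : hosts) {
            try {
                String reply = fourLetterWord(host, 2181, "srvr", 5000);
                System.out.println(host + " -> " + parseMode(reply));
            } catch (IOException e) {
                System.out.println(host + " -> no response (" + e.getMessage() + ")");
            }
        }
    }
}
```

      A node that reports "unknown", or does not respond at all, while the others report leader or follower would be a reasonable place to start reading the attached logs.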

      Attachments

        1. debug_logs.zip
          2.29 MB
          Aishwarya Soni
        2. node1.txt
          8.22 MB
          Aishwarya Soni
        3. node2.txt
          4.20 MB
          Aishwarya Soni
        4. node3.txt
          5.19 MB
          Aishwarya Soni
        5. node4.txt
          9.45 MB
          Aishwarya Soni
        6. node5.txt
          6.98 MB
          Aishwarya Soni

            People

              Assignee: Unassigned
              Reporter: Aishwarya Soni (aishwaryasoni1991)
              Votes: 9
              Watchers: 16
