Status: Open
Resolution: Unresolved
Affects Version/s: 3.5.5, 3.5.7, 3.6.1, 3.5.8
I have configured a 5-node ZooKeeper cluster using version 3.6.1 in a Docker containerized environment. As part of some destructive testing, I restarted the ZooKeeper leader. Re-election happened, and all 5 nodes (containers) are back in a good state with a new leader. But when I log in to one of the containers, open the zk CLI (./ and run the command ls /, I see the error below:
[zk: localhost:2181(CONNECTING) 1]
[zk: localhost:2181(CONNECTING) 1] ls /
2020-05-14 23:48:26,556 [myid:localhost:2181] - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1229] - Client session timed out, have not heard from server in 30001ms for session id 0x0
2020-05-14 23:48:26,556 [myid:localhost:2181] - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1272] - Session 0x0 for sever localhost/, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client session timed out, have not heard from server in 30001ms for session id 0x0
at org.apache.zookeeper.ClientCnxn$
KeeperErrorCode = ConnectionLoss for /
[zk: localhost:2181(CONNECTING) 2] 2020-05-14 23:48:28,089 [myid:localhost:2181] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1154] - Opening socket connection to server localhost/
2020-05-14 23:48:28,089 [myid:localhost:2181] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1156] - SASL config status: Will not attempt to authenticate using SASL (unknown error)
2020-05-14 23:48:28,090 [myid:localhost:2181] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@986] - Socket connection established, initiating session, client: /, server: localhost/
2020-05-14 23:48:58,119 [myid:localhost:2181] - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1229] - Client session timed out, have not heard from server in 30030ms for session id 0x0
2020-05-14 23:48:58,120 [myid:localhost:2181] - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1272] - Session 0x0 for sever localhost/, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client session timed out, have not heard from server in 30030ms for session id 0x0
at org.apache.zookeeper.ClientCnxn$
2020-05-14 23:49:00,003 [myid:localhost:2181] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1154] - Opening socket connection to server localhost/
2020-05-14 23:49:00,004 [myid:localhost:2181] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1156] - SASL config status: Will not attempt to authenticate using SASL (unknown error)
2020-05-14 23:49:00,004 [myid:localhost:2181] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@986] - Socket connection established, initiating session, client: /, server: localhost/
2020-05-14 23:49:30,032 [myid:localhost:2181] - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1229] - Client session timed out, have not heard from server in 30029ms for session id 0x0
2020-05-14 23:49:30,033 [myid:localhost:2181] - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1272] - Session 0x0 for sever localhost/, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client session timed out, have not heard from server in 30029ms for session id 0x0
at org.apache.zookeeper.ClientCnxn$
2020-05-14 23:49:31,230 [myid:localhost:2181] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1154] - Opening socket connection to server localhost/
2020-05-14 23:49:31,230 [myid:localhost:2181] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1156] - SASL config status: Will not attempt to authenticate using SASL (unknown error)
2020-05-14 23:49:31,230 [myid:localhost:2181] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@986] - Socket connection established, initiating session, client: /, server: localhost/
Does anyone know what could possibly be wrong? For reference:
This behavior is observed on all the nodes when the leader is restarted. All is good when a follower is restarted.
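To narrow down which nodes are actually serving requests after the leader restart, one option is to query every node with the `srvr` four-letter command and compare the reported modes. The following is only a minimal sketch: it assumes `srvr` is allowed by the `4lw.commands.whitelist` setting on each server, and the host names `zk1` ... `zk5` in the usage comment are hypothetical placeholders for the container addresses.

```python
import socket


def four_letter_word(host, port=2181, cmd="srvr", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        # The server closes the connection once it has written its reply.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_output):
    """Return the value of the 'Mode:' line (e.g. leader, follower), or None."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def probe(hosts, port=2181):
    """Map each host to its reported mode, or to a short error description."""
    results = {}
    for host in hosts:
        try:
            results[host] = parse_mode(four_letter_word(host, port))
        except OSError as exc:  # refused, timed out, name not resolvable, ...
            results[host] = "unreachable ({})".format(exc)
    return results


# Example usage (hypothetical host names):
# for host, mode in probe(["zk1", "zk2", "zk3", "zk4", "zk5"]).items():
#     print(host, mode)
```

A node that accepts the TCP connection but reports no `Mode:` line (a result of `None` here) is reachable but not serving requests, which would be consistent with the client above staying in the CONNECTING state even though the socket is established.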
Issue Links
- relates to
  - ZOOKEEPER-3466 (Open): ZK cluster converges, but does not properly handle client connections (new in 3.5.5)
  - ZOOKEEPER-3920 (Closed): Zookeeper clients timeout after leader change due to address when in docker environment