Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.4.4
Fix Version/s: None
Component/s: None
Labels: None
Description
In our 5-node ZooKeeper cluster, one node consistently refuses client connections. From the thread dump, the ZooKeeperServer appears to hang while waiting for the server to reach the running state, even though the node itself is running normally and is synced with the leader.
$ ./zkCli.sh -server 10.101.10.67:11000 ls /
2014-11-27 20:57:11,843 [myid:] - WARN [main-SendThread(lg-com-master02.bj:11000):ClientCnxn$SendThread@1089] - Session 0x0 for server lg-com-master02.bj/10.101.10.67:11000, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:192)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:353)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1469)
    at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1497)
    at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:726)
    at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:594)
    at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:355)
    at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:283)
ZooKeeperServer stack:
"NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11000" daemon prio=10 tid=0x00007f60143f7800 nid=0x31fd in Object.wait() [0x00007f5fd4678000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at org.apache.zookeeper.server.ZooKeeperServer.submitRequest(ZooKeeperServer.java:634)
    - locked <0x00000007602756a0> (a org.apache.zookeeper.server.quorum.FollowerZooKeeperServer)
    at org.apache.zookeeper.server.ZooKeeperServer.submitRequest(ZooKeeperServer.java:626)
    at org.apache.zookeeper.server.ZooKeeperServer.createSession(ZooKeeperServer.java:525)
    at org.apache.zookeeper.server.ZooKeeperServer.processConnectRequest(ZooKeeperServer.java:841)
    at org.apache.zookeeper.server.NIOServerCnxn.readConnectRequest(NIOServerCnxn.java:410)
    at org.apache.zookeeper.server.NIOServerCnxn.readPayload(NIOServerCnxn.java:200)
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:236)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
    at java.lang.Thread.run(Thread.java:662)
Any suggestions about this problem? Thanks.
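For readers trying to follow the thread dump above: the NIO thread is in TIMED_WAITING inside submitRequest while holding the FollowerZooKeeperServer monitor, which is consistent with a guarded timed wait on a "running" condition that never becomes true. The sketch below is only a simplified illustration of that pattern, not the ZooKeeper source. The class RunningGateSketch and the methods markRunning and awaitRunning are hypothetical names, and the deadline is an assumption added here to show how a caller could give up instead of blocking forever.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a guarded timed wait on a "running" flag.
// This is NOT the ZooKeeper implementation; all names are made up.
public class RunningGateSketch {
    private boolean running = false;

    // Called by whatever thread finishes startup; wakes all waiters.
    public synchronized void markRunning() {
        running = true;
        notifyAll();
    }

    // Waits until the flag is set or the deadline passes.
    // Returns true if the flag was observed set, false on timeout or interrupt.
    public synchronized boolean awaitRunning(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!running) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // give up instead of looping forever
            }
            long remainingMillis = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            try {
                // Timed wait: a thread parked here shows up as TIMED_WAITING
                // (on object monitor) in a thread dump.
                wait(Math.min(remainingMillis, 100));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        RunningGateSketch gate = new RunningGateSketch();
        // Nobody calls markRunning() yet, so this times out.
        System.out.println("before markRunning: " + gate.awaitRunning(300));
        gate.markRunning();
        // The flag is set, so this returns immediately.
        System.out.println("after markRunning: " + gate.awaitRunning(300));
    }
}
```

If the waiting thread is also the only thread that accepts connections, as the NIOServerCxn.Factory thread appears to be in the dump, a wait without a deadline would stall every new client, which matches the connection resets seen from zkCli.sh.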