Details
Description
I have a zookeeper cluster with 5 servers, zookeeper version 3.2.1, here is the content in the configure file, zoo.cfg
======
- The number of milliseconds of each tick
tickTime=2000 - The number of ticks that the initial
- synchronization phase can take
initLimit=5 - The number of ticks that can pass between
- sending a request and getting an acknowledgement
syncLimit=2 - the directory where the snapshot is stored.
dataDir=./data/ - the port at which the clients will connect
clientPort=8181
- zookeeper cluster list
server.100=10.23.253.43:8887:8888
server.101=10.23.150.29:8887:8888
server.102=10.23.247.141:8887:8888
server.200=10.65.20.68:8887:8888
server.201=10.65.27.21:8887:8888
=====
Before the problem happened, the server.200 was the leader. Yesterday morning, I found the there were many sockets with the state of CLOSE_WAIT on the clientPort (8181), the total was over about 120. Because of these CLOSE_WAIT, the server.200 could not accept more connections from the clients. The only thing I can do under this situation is restart the server.200, at about 2010-02-01 06:06:35. The related log is attached to the issue.