ZOOKEEPER-2111 introduced synchronized(state) statements in ClientCnxn and ClientCnxn.SendThread to coordinate insertion in outgoingQueue and draining it when the client connection isn't alive.
There are several issues with this approach:
- the value of the state field is not stable, meaning we don't always synchronize on the same object.
- the state field is an enum value, which are global objects. So in an application with several ZooKeeper clients connected to different servers, this causes some contention between clients.
An easy fix is change those synchronized(state) statements to synchronized(outgoingQueue) since it is local to each client and is what we want to coordinate.
I'll be happy to prepare a PR with the above change if this is deemed to be the correct way to fix it.
Another issue that makes contention worse is ClientCnxnSocketNIO.cleanup() that is called from within the above synchronized block and contains Thread.sleep(100). Why is this sleep statement needed, and can we remove it?