Details
- Sub-task
- Status: Closed
- Major
- Resolution: Fixed
- 3.6.0
Description
Now that the socket is closed asynchronously, the shutdown process can proceed while the socket is being closed. However, shutdown can still stall if a thread that is being shut down is writing to the socket. For example, when shutdown is called, the SyncRequestProcessor flushes all ACK packets in its queue by calling Learner.writePacket(), which does not return (with an IO exception) until the socket finishes closing. So shutdown is still delayed by the socket closing time.
To get around the delay, we move Learner.writePacket() to a separate thread. The tricky part is handling the IO exception thrown by Learner.writePacket(). Currently, the IO exception is caught by different callers in different ways. For example, if an IO exception is caught during revalidateSession, the session is closed and removed. In other cases, such as in FollowerRequestProcessor and SendAckRequestProcessor, the quorum socket is closed when the IO exception is caught. Once the write is moved to its own thread, the callers can no longer catch and handle the exception, so it has to be handled within the sending function. We reason that if an IO exception is thrown on the quorum socket of a follower, the only sensible response is to shut down the server. So we make the sending thread a ZooKeeperCriticalThread.
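To make the idea concrete, here is a minimal sketch of the pattern described above, not the actual patch. All names in it (AsyncPacketSender, PacketWriter, queuePacket, onFatalError) are hypothetical stand-ins: PacketWriter plays the role of the blocking Learner.writePacket() call, and the onFatalError callback stands in for the "shut down the server" behaviour that a ZooKeeperCriticalThread would provide. Packets are modelled as plain strings for brevity.

```java
import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical sketch of a dedicated sender thread; not ZooKeeper's real classes. */
class AsyncPacketSender {

    /** Stand-in for the blocking quorum-socket write (assumption). */
    public interface PacketWriter {
        void write(String packet) throws IOException;
    }

    // Sentinel compared by identity so that no real packet can match it.
    private static final String POISON = new String("POISON");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final PacketWriter writer;
    private final Consumer<IOException> onFatalError;
    private final Thread thread;

    AsyncPacketSender(PacketWriter writer, Consumer<IOException> onFatalError) {
        this.writer = writer;
        this.onFatalError = onFatalError;
        this.thread = new Thread(this::run, "packet-sender");
        this.thread.setDaemon(true);
    }

    void start() {
        thread.start();
    }

    /** Callers only enqueue: they never block on the socket and never see IOException. */
    void queuePacket(String packet) {
        queue.add(packet);
    }

    /** Asks the sender to exit once the packets queued before this call are written. */
    void shutdown() {
        queue.add(POISON);
    }

    /** Waits up to the given number of milliseconds for the sender thread to exit. */
    void join(long millis) throws InterruptedException {
        thread.join(millis);
    }

    private void run() {
        try {
            while (true) {
                String packet = queue.take();
                if (packet == POISON) {
                    return;
                }
                writer.write(packet); // may block while the socket is closing
            }
        } catch (IOException e) {
            // A single place handles the failure: treat it as fatal and stop sending.
            onFatalError.accept(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With this shape, a request processor that is being shut down only calls queuePacket() and returns immediately; a write that is stuck on a closing socket delays only the sender thread. The trade-off is the one noted in the description: per-caller recovery (such as closing a single session in revalidateSession) is replaced by one fatal-error path.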
Issue Links
- is required by: ZOOKEEPER-4074 Network issue while Learner is executing writePacket can cause the follower to hang (Open)