Details
Description
Reader threads can die because of a race condition with the responder thread. If the server's IPC handler cannot send a response in one write, it delegates sending the rest of the response to the responder thread.
The race occurs when the responder thread hits an exception while writing to the socket. The responder closes the socket, which wakes up the reader polling on that socket. If the reader then throws a CancelledKeyException, which is a runtime exception, the reader thread dies. All connections serviced by that reader are left in limbo until the client times out, if it ever does. New connections play roulette as to whether they are assigned to a defunct reader.
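The failure mode can be reproduced in isolation with plain java.nio. The sketch below is not the Hadoop server code: the class and method names (ReaderRaceSketch, guardedIsReadable, readerSurvives) are hypothetical, a Pipe stands in for the client socket, and the "responder" close happens on the same thread so the outcome is deterministic. It only illustrates that touching a selection key after its channel has been closed throws CancelledKeyException, and that a guard around the check is one way a reader loop could keep running.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical illustration only; none of these names come from Hadoop.
final class ReaderRaceSketch {

    // Guarded check: a cancelled key is skipped instead of killing the thread.
    // The catch matters in a real race, where the key can be cancelled
    // between isValid() and isReadable().
    static boolean guardedIsReadable(SelectionKey key) {
        try {
            return key.isValid() && key.isReadable();
        } catch (CancelledKeyException e) {
            return false;
        }
    }

    // Returns true if the simulated reader iteration completed without
    // a CancelledKeyException escaping, false if the exception escaped.
    static boolean readerSurvives(boolean guarded) {
        try {
            return simulate(guarded);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean simulate(boolean guarded) throws IOException {
        Pipe pipe = Pipe.open(); // stand-in for a client connection
        try (Selector selector = Selector.open()) {
            pipe.source().configureBlocking(false);
            SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);

            // Make the channel readable so the "reader" has a selected key.
            pipe.sink().write(ByteBuffer.wrap(new byte[] {1}));
            selector.select(1000);

            // The "responder" closes the channel, which cancels the key.
            pipe.source().close();

            try {
                if (guarded) {
                    guardedIsReadable(key);
                } else {
                    key.isReadable(); // throws on a cancelled key
                }
                return true;
            } catch (CancelledKeyException e) {
                return false; // in a real reader loop this would end the thread
            }
        } finally {
            pipe.sink().close();
            pipe.source().close();
        }
    }

    public static void main(String[] args) {
        System.out.println("unguarded reader survives: " + readerSurvives(false));
        System.out.println("guarded reader survives: " + readerSurvives(true));
    }
}
```

In a real server the close happens on another thread, so checking isValid() alone would not be enough; that is why the guarded variant also catches the exception.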
Attachments
Issue Links
- duplicates
  - HADOOP-13657 IPC Reader thread could silently die and leave NameNode unresponsive (Resolved)