Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Version: 2.0.7
Description
Every once in a while (in production only, of course), a couple of threads will go off into the weeds, consuming 100% CPU, and never come back. They're sitting in an infinite loop that looks something like this:
- We're chugging merrily along in the main processing loop (AbstractPollingIoProcessor.java:1070)
- Decide to flush the single session in flushingSessions (AbstractPollingIoProcessor.java:773, :1129)
- According to "SessionState state = getState(session)", the session is OPENED, which I think is a lie, and perhaps the root of the problem.
- Enter "flushNow" (AbstractPollingIoProcessor.java:821, :789, :1129)
- Begin processing a queued message (AbstractPollingIoProcessor.java:861, :789, :1129)
- Try to write out the message in writeBuffer (AbstractPollingIoProcessor.java:931, :861, :789, :1129)
- Catch an IOException ("Broken pipe") in writeBuffer (AbstractPollingIoProcessor.java:935, :861, :789, :1129), call "session.close(true)"
- Next time around the main loop (AbstractPollingIoProcessor.java:1070), the session is put back in flushingSessions, because apparently the session is still writable (liar!) (AbstractPollingIoProcessor.java:671, :653, :1124)
- Repeat ad infinitum!
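To make the cycle above easier to reason about, here is a minimal, self-contained model of it. This is not MINA code: FlushSpinDemo, FakeSession, looksWritable and countFlushAttempts are hypothetical names invented for this sketch, and the loop is capped so that it terminates. It assumes, as the report suspects, that the session keeps reporting itself as writable after close is requested.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Queue;

public class FlushSpinDemo {

    /** Hypothetical stand-in for a session whose peer has gone away. */
    static final class FakeSession {
        boolean closeRequested = false;

        // Assumption taken from the report: the session still looks
        // writable even after close has been requested.
        boolean looksWritable() {
            return true;
        }

        // Every write fails the same way the report describes.
        void write() throws IOException {
            throw new IOException("Broken pipe");
        }

        void close() {
            closeRequested = true;
        }
    }

    /** Runs a capped version of the processing loop and counts flush attempts. */
    static int countFlushAttempts(int maxIterations) {
        Queue<FakeSession> flushingSessions = new ArrayDeque<>();
        FakeSession session = new FakeSession();
        int attempts = 0;

        for (int i = 0; i < maxIterations; i++) {
            // The session is scheduled for flushing again because it
            // still claims to be writable.
            if (session.looksWritable() && !flushingSessions.contains(session)) {
                flushingSessions.add(session);
            }

            // Flush everything that is queued.
            FakeSession s;
            while ((s = flushingSessions.poll()) != null) {
                attempts++;
                try {
                    s.write();
                } catch (IOException ioe) {
                    // Close is requested, but nothing removes the session
                    // from the processor, so it comes back next iteration.
                    s.close();
                }
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        // One failed flush per loop iteration: the work never drains.
        System.out.println(countFlushAttempts(1000)); // prints 1000
    }
}
```

In this model the number of failed flush attempts grows in step with the number of loop iterations, which is the 100% CPU spin once the cap is removed.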
The suggested fix, courtesy of Emmanuel Lécharny, is to add this line to the AbstractPollingIoProcessor class, at line 927:
    try {
        localWrittenBytes = write(session, buf, length);
    } catch (IOException ioe) {
        // We have had an issue while trying to send data to the
        // peer: let's close the session.
        buf.free();
        session.close(true);
        destroy(session); // <<<<<<<<<<<<<<<<<---- This line
        return 0;
    }
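The shape of that error path can be tried in isolation with the sketch below. It is only an illustration under stated assumptions: Session, BrokenPipeSession and Processor are hypothetical types written for this example, and the destroy method here simply drops the session from a list, which is a guess at the intent and not a description of what MINA's destroy does internally.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class WriteFailureDemo {

    /** Hypothetical minimal session interface, not MINA's IoSession. */
    interface Session {
        int write(byte[] buf, int length) throws IOException;
        void close(boolean immediately);
    }

    /** A session whose writes always fail, recording the calls it receives. */
    static final class BrokenPipeSession implements Session {
        final List<String> calls = new ArrayList<>();

        @Override
        public int write(byte[] buf, int length) throws IOException {
            calls.add("write");
            throw new IOException("Broken pipe");
        }

        @Override
        public void close(boolean immediately) {
            calls.add("close(" + immediately + ")");
        }
    }

    /** Hypothetical processor that tracks the sessions it is responsible for. */
    static final class Processor {
        final List<Session> registered = new ArrayList<>();

        // Assumed behaviour for this sketch: forget the session entirely.
        void destroy(Session session) {
            registered.remove(session);
        }

        int writeBuffer(Session session, byte[] buf, int length) {
            try {
                return session.write(buf, length);
            } catch (IOException ioe) {
                // Same order as the suggested fix: close, then destroy,
                // then report that nothing was written.
                session.close(true);
                destroy(session);
                return 0;
            }
        }
    }

    public static void main(String[] args) {
        Processor processor = new Processor();
        BrokenPipeSession session = new BrokenPipeSession();
        processor.registered.add(session);

        int written = processor.writeBuffer(session, new byte[16], 16);

        System.out.println(written);                        // prints 0
        System.out.println(processor.registered.isEmpty()); // prints true
        System.out.println(session.calls);                  // prints [write, close(true)]
    }
}
```

Once the failed session is no longer registered with the processor in this model, there is nothing left to put back into a flush queue, so the retry cycle cannot restart.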