Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
QuorumJournalManager (HDFS-3077)
None
Reviewed
Description
TestQJMWithFaults.testRecoverAfterDoubleFailures fails very occasionally. Looking into it, the issue seems to be that, by random chance, an IPC server port can be reused between two different iterations of the test loop. The client then picks up and reuses the existing IPC connection to the old server. However, the old server was shut down and restarted, so the old IPC connection is stale (i.e., disconnected). This causes the new client to get an EOF when it sends the "format()" call.
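To make the failure mode concrete, here is a minimal, self-contained sketch of a connection cache keyed only by port. It is not Hadoop's IPC code: FakeServer, FakeConnection, FakeClientCache, tryFormat and the port number are all hypothetical names invented for illustration. The eviction step at the end is just one possible remedy and is not necessarily the fix that was applied here.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleConnectionSketch {
    /** Hypothetical stand-in for an IPC server bound to a port. */
    static final class FakeServer {
        final int port;
        boolean running = true;
        FakeServer(int port) { this.port = port; }
        void stop() { running = false; }
    }

    /** Hypothetical connection that stays tied to the server it was opened against. */
    static final class FakeConnection {
        private final FakeServer peer;
        FakeConnection(FakeServer peer) { this.peer = peer; }
        String call(String method) {
            if (!peer.running) {
                // Models the EOF seen on a connection whose server has gone away.
                throw new IllegalStateException("EOF calling " + method);
            }
            return method + " ok";
        }
    }

    /** Hypothetical client-side cache keyed only by port (an assumption of this sketch). */
    static final class FakeClientCache {
        private final Map<Integer, FakeConnection> byPort = new HashMap<>();
        FakeConnection get(FakeServer current) {
            // Reuses any cached connection for this port, even if it belongs to an old server.
            return byPort.computeIfAbsent(current.port, p -> new FakeConnection(current));
        }
        void evict(int port) { byPort.remove(port); }
    }

    /** Sends format() through the cache and returns the result or the error message. */
    static String tryFormat(FakeClientCache cache, FakeServer server) {
        try {
            return cache.get(server).call("format");
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        FakeClientCache cache = new FakeClientCache();

        // Iteration 1: server comes up, client connects, server is shut down.
        FakeServer first = new FakeServer(12345);
        System.out.println(tryFormat(cache, first));   // prints "format ok"
        first.stop();

        // Iteration 2: a new server happens to get the same port.
        FakeServer second = new FakeServer(12345);
        System.out.println(tryFormat(cache, second));  // prints "EOF calling format"

        // Dropping the stale entry forces a fresh connection to the new server.
        cache.evict(12345);
        System.out.println(tryFormat(cache, second));  // prints "format ok"
    }
}
```

The second call fails even though a healthy server is listening on the port, because the cache lookup never checks whether the cached connection still belongs to a live peer.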