Details
Sub-task
Status: Resolved
Major
Resolution: Fixed
QuorumJournalManager (HDFS-3077)
None
Reviewed
Description
Currently, if a logger misses an RPC in the middle of a log segment, or misses the startLogSegment RPC (e.g. because it was down or the network was disconnected at the time), it throws an exception on every subsequent journal() call in that segment, since it knows that it has missed some edits.
We should change this exception to a specific IOException subclass, and have the client side of QJM detect the situation and stop sending IPCs to that logger until the next startLogSegment call.
This isn't critical for correctness but will help reduce log spew on both sides.
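A minimal sketch of the proposed client-side behavior, to make the idea concrete. All names here (OutOfSyncException, RemoteJournal, LoggerChannel) are hypothetical stand-ins rather than the actual QJM classes, and the sketch assumes the exception reaches the client unwrapped; over a real IPC layer it would likely need to be unwrapped first.

```java
import java.io.IOException;

/** Hypothetical IOException subclass: the logger missed edits in the current segment. */
class OutOfSyncException extends IOException {
    OutOfSyncException(String msg) { super(msg); }
}

/** Hypothetical stand-in for the remote logger's RPC interface. */
interface RemoteJournal {
    void startLogSegment(long txid) throws IOException;
    void journal(long firstTxid, byte[] edits) throws IOException;
}

/** Client-side channel that stops sending edits once the logger reports it is out of sync. */
class LoggerChannel {
    private final RemoteJournal remote;
    private boolean outOfSync = false;
    private int skipped = 0;

    LoggerChannel(RemoteJournal remote) { this.remote = remote; }

    /** A new segment lets the logger catch up, so clear the flag once the call succeeds. */
    void startLogSegment(long txid) throws IOException {
        remote.startLogSegment(txid);
        outOfSync = false;
    }

    /** Returns true if the edits were sent and accepted, false if skipped or rejected as out of sync. */
    boolean journal(long firstTxid, byte[] edits) throws IOException {
        if (outOfSync) {
            // Known to be out of sync: skip the call instead of triggering another exception.
            skipped++;
            return false;
        }
        try {
            remote.journal(firstTxid, edits);
            return true;
        } catch (OutOfSyncException e) {
            // Remember the state so later calls in this segment are suppressed.
            outOfSync = true;
            return false;
        }
    }

    boolean isOutOfSync() { return outOfSync; }

    int getSkippedCount() { return skipped; }
}
```

In this sketch only the dedicated subclass sets the flag; any other IOException still propagates to the caller, so unrelated failures are not silently suppressed.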