Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Reviewed
Description
In LightWeightCache#evictExpiredEntries(), the precondition check can fail. patwhitey2007 ran an HA failover test, and the failure occurred while the SBN was catching up with edits during a transition to active. This caused the NN to terminate.
Here is my theory: if an RPC handler calls waitForCompletion() and happens to remove the head of the queue in get(), it will race with evictExpiredEntries() called from put().
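To make the suspected interleaving concrete, here is a minimal single-threaded sketch that replays the three steps in a fixed order. It is not the Hadoop code: the EvictionRaceSketch class, its Entry type, and the expiry values are hypothetical, and it assumes that get() removes the entry from the queue and re-adds it with a later expiry, and that the eviction path peeks at the head and then expects to evict that same entry.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class EvictionRaceSketch {
    /** Hypothetical stand-in for a cache entry; not Hadoop's real entry type. */
    static final class Entry {
        final String key;
        long expiryTime;

        Entry(String key, long expiryTime) {
            this.key = key;
            this.expiryTime = expiryTime;
        }
    }

    /**
     * Replays the suspected interleaving deterministically and returns true
     * if the evicted entry differs from the one that was peeked, which is the
     * condition under which a "evicted == peeked" precondition would fail.
     */
    static boolean interleavedEvictionSeesDifferentHead() {
        PriorityQueue<Entry> queue =
            new PriorityQueue<>(Comparator.comparingLong((Entry e) -> e.expiryTime));
        Entry a = new Entry("a", 10);
        Entry b = new Entry("b", 20);
        queue.add(a);
        queue.add(b);
        long now = 30; // both entries are already expired at this point

        // Step 1 (thread in put(), inside the eviction loop): peek at the head.
        Entry peeked = queue.peek(); // this is "a"

        // Step 2 (RPC handler in get()): remove the head and, by assumption,
        // re-queue it with a refreshed expiry time.
        queue.remove(a);
        a.expiryTime = now + 100;
        queue.add(a);

        // Step 3 (back in the eviction loop): evict the current head, which is
        // now "b", not the entry that was peeked in step 1.
        Entry evicted = queue.poll();
        return evicted != peeked;
    }

    public static void main(String[] args) {
        System.out.println("precondition would fail: "
            + interleavedEvictionSeesDifferentHead());
    }
}
```

If this theory holds, the peek and the evict have to happen under the same lock as the removal in get(), so that no other thread can change the head of the queue in between.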