Details
Bug
Status: Closed
Major
Resolution: Fixed
2.6.0
None
Reviewed
Description
We recently saw the RM for a large cluster lag far behind on the AsyncDispatcher event queue. The AsyncDispatcher thread was consistently blocked on the highly contended CapacityScheduler lock while trying to dispatch preemption-related events for RMContainerPreemptEventDispatcher. Preemption processing should occur on the scheduler event dispatcher thread or on a separate thread, so that it does not delay the processing of other events in the primary dispatcher queue.
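To illustrate the proposed direction, here is a minimal, self-contained sketch of handing preemption events off to a second dispatcher thread. It is not the actual YARN code: `OffloadingDispatcherSketch`, `QueueDispatcher`, `primaryStaysResponsive`, the string event names, and the plain `Object` standing in for the CapacityScheduler lock are all hypothetical and exist only for this example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch; none of these types are real YARN classes. */
public class OffloadingDispatcherSketch {

    /** Drains a queue on its own daemon thread and passes each event to a handler. */
    static final class QueueDispatcher<E> {
        private final BlockingQueue<E> queue = new LinkedBlockingQueue<>();
        private final Thread worker;

        QueueDispatcher(String name, Consumer<E> handler) {
            worker = new Thread(() -> {
                try {
                    while (true) {
                        handler.accept(queue.take());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // stop() was called; exit the loop
                }
            }, name);
            worker.setDaemon(true);
            worker.start();
        }

        void enqueue(E event) { queue.add(event); }

        void stop() { worker.interrupt(); }
    }

    /**
     * Holds a stand-in scheduler lock on the calling thread, then checks whether
     * the primary dispatcher still handles a non-preemption event in the meantime.
     */
    static boolean primaryStaysResponsive() {
        final Object schedulerLock = new Object(); // stand-in for the contended scheduler lock
        final CountDownLatch otherEventHandled = new CountDownLatch(1);

        // Secondary thread: the only place that takes the scheduler lock for preemption events.
        final QueueDispatcher<String> schedulerDispatcher =
            new QueueDispatcher<String>("scheduler-dispatcher", event -> {
                synchronized (schedulerLock) {
                    // preemption handling would happen here
                }
            });

        // Primary thread: forwards preemption events instead of handling them inline.
        final QueueDispatcher<String> primary =
            new QueueDispatcher<String>("primary-dispatcher", event -> {
                if (event.startsWith("PREEMPT")) {
                    schedulerDispatcher.enqueue(event); // non-blocking hand-off
                } else {
                    otherEventHandled.countDown();
                }
            });

        boolean handled = false;
        try {
            synchronized (schedulerLock) { // simulate heavy contention on the lock
                primary.enqueue("PREEMPT_CONTAINER");
                primary.enqueue("NODE_UPDATE");
                handled = otherEventHandled.await(5, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            primary.stop();
            schedulerDispatcher.stop();
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println("primary stays responsive: " + primaryStaysResponsive());
    }
}
```

In this sketch the primary thread only enqueues the preemption event and moves on, so the later event is not stuck behind the lock. If the primary handler acquired `schedulerLock` itself, every event queued behind a preemption event would wait for the lock, which is the lag described above.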