Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 2.7.0, 2.6.1
- Reviewed
Description
In SchedulingMonitor.java, when the service starts, it starts a checker thread to perform Capacity Scheduler's preemption. However, the implementation of this checker thread has the following issue:
    while (!stopped && !Thread.currentThread().isInterrupted()) {
      ....
      try {
        Thread.sleep(monitorInterval);
      } catch (InterruptedException e) {
        ....
        break;
      }
    }
The above code snippet will terminate the checker thread whenever it is interrupted.
We noticed in our cluster that this could leave CapacityScheduler's preemption unexpectedly disabled, because the checker thread had been terminated.
We propose using ScheduledExecutorService to make this part of the code more robust and to ensure the liveness of CapacityScheduler's preemption functionality.
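A minimal sketch of what the proposal might look like, not the actual patch. The class name PeriodicChecker, its fields, and the choice of scheduleWithFixedDelay (picked to mirror the sleep-after-each-run behaviour of the loop above) are assumptions made for illustration. The executor owns the worker thread and re-arms the task after every run, so the schedule no longer depends on one hand-written loop surviving an interrupt. The task body catches everything it throws, because a periodic task that throws is not run again.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the monitor service; not the real SchedulingMonitor. */
final class PeriodicChecker {
    private final Runnable policy;          // assumed: the periodic check to perform
    private final long monitorIntervalMs;   // delay between the end of one run and the next
    private ScheduledExecutorService ses;
    private ScheduledFuture<?> handle;

    PeriodicChecker(Runnable policy, long monitorIntervalMs) {
        this.policy = policy;
        this.monitorIntervalMs = monitorIntervalMs;
    }

    synchronized void start() {
        // Single daemon worker thread, created and managed by the executor.
        ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-checker");
            t.setDaemon(true);
            return t;
        });
        handle = ses.scheduleWithFixedDelay(
            this::runOnce, 0, monitorIntervalMs, TimeUnit.MILLISECONDS);
    }

    private void runOnce() {
        try {
            policy.run();
        } catch (Throwable t) {
            // An exception escaping a periodic task suppresses all later runs,
            // so report it here and let the next scheduled run proceed.
            System.err.println("Periodic check failed: " + t);
        }
    }

    synchronized void stop() {
        if (handle != null) {
            handle.cancel(true);
        }
        if (ses != null) {
            ses.shutdownNow();
        }
    }
}
```

With this shape, a failure or interrupt during one run affects at most that run; only an explicit stop() ends the schedule.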
Attachments
Issue Links
- breaks: YARN-7084 TestSchedulingMonitor#testRMStarts fails sporadically (Resolved)