Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
PR #250 for the placeholder cleanup introduced a possible deadlock.
When a terminated app is moved from the queue, the queue is unlinked from the app. Before that happens, we make sure that the queue tracking is up to date. This requires an application lock.
The move used to take a lock on the partition and do all its work in one go. With the change to remove the queue from the app, we now lock the app while holding the partition lock. Since the app stays in the active list until the move is done, scheduling might check the app at the same time. This could lead to a deadlock if scheduling holds the application lock while it waits for the partition lock.
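One way to avoid this kind of lock inversion is to never hold the partition lock and the application lock together during the move. The Go sketch below illustrates that idea only; it is not the actual fix. `Application`, `Partition`, `unlinkQueue`, `moveTerminated` and `schedule` are hypothetical stand-ins, not YuniKorn types or functions, and the assumption that scheduling takes the application lock before the partition lock is made purely for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Application is a hypothetical stand-in for a scheduler application.
type Application struct {
	sync.Mutex
	id    string
	queue string
}

// Partition is a hypothetical stand-in that tracks active and completed apps.
type Partition struct {
	sync.Mutex
	active    map[string]*Application
	completed map[string]*Application
}

// unlinkQueue clears the queue reference under the application lock.
func (a *Application) unlinkQueue() {
	a.Lock()
	defer a.Unlock()
	a.queue = ""
}

// moveTerminated moves an app from the active to the completed list.
// The partition lock is released before the application lock is taken,
// so the two locks are never held at the same time.
func (p *Partition) moveTerminated(id string) bool {
	p.Lock()
	app, ok := p.active[id]
	p.Unlock()
	if !ok {
		return false
	}

	// Application lock only: no partition lock is held here.
	app.unlinkQueue()

	p.Lock()
	defer p.Unlock()
	// Re-check: another caller may have moved the app in the meantime.
	if _, still := p.active[id]; !still {
		return false
	}
	delete(p.active, id)
	p.completed[id] = app
	return true
}

// schedule mimics a scheduling pass that (by assumption) locks the app
// first and the partition second.
func (p *Partition) schedule(app *Application) bool {
	app.Lock()
	defer app.Unlock()
	p.Lock()
	defer p.Unlock()
	_, ok := p.active[app.id]
	return ok
}

func main() {
	app := &Application{id: "app-1", queue: "root.default"}
	p := &Partition{
		active:    map[string]*Application{"app-1": app},
		completed: map[string]*Application{},
	}

	// Run scheduling passes concurrently with the move.
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			p.schedule(app)
		}()
	}
	moved := p.moveTerminated("app-1")
	wg.Wait()

	fmt.Println(moved, len(p.active), app.queue == "")
}
```

If `moveTerminated` instead called `app.unlinkQueue()` while still holding the partition lock, a concurrent `schedule` call that already holds the application lock would block on the partition lock, and both goroutines would wait on each other indefinitely.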
Issue Links
- is related to
  - YUNIKORN-460 Handle app reservation timeout (Closed)
  - YUNIKORN-519 Cleanup placeholders when the app is Completed (Closed)