Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
1.1.0-Ducc
None
Description
The original DUCC design specified that non-preemptable shares could never be deallocated by the RM unless the owner canceled the work running in those shares, even if the underlying system failed.
Since then, job-jobs, services, and "managed reservations" (MRs) have been defined to run in non-preemptable shares.
If the RM detects a node failure, it should now include non-preemptable shares in the reaper, provided the shares are occupied by job-jobs, services, or MRs. "Unmanaged" reservations remain allocated indefinitely, or until the owner releases them.
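
The rule described above could be sketched roughly as follows. This is only an illustration, not the RM's actual code: the class `ReaperSketch`, the `WorkKind` enum, the `Share` type, and both method names are hypothetical, and the sketch assumes that preemptable shares on a failed node were already being reaped.

```java
import java.util.ArrayList;
import java.util.List;

public class ReaperSketch {
    // Hypothetical categories of work that can occupy a share (not DUCC's real types).
    public enum WorkKind { JOB, SERVICE, MANAGED_RESERVATION, UNMANAGED_RESERVATION }

    // Hypothetical, minimal view of a share on a node.
    public static final class Share {
        public final String id;
        public final WorkKind kind;
        public final boolean preemptable;

        public Share(String id, WorkKind kind, boolean preemptable) {
            this.id = id;
            this.kind = kind;
            this.preemptable = preemptable;
        }
    }

    // Decide whether a share on a failed node may be reclaimed.
    public static boolean reapableOnNodeFailure(Share s) {
        if (s.preemptable) {
            return true; // assumption: preemptable shares were already reaped
        }
        // Non-preemptable shares are reaped unless they hold an unmanaged reservation.
        return s.kind != WorkKind.UNMANAGED_RESERVATION;
    }

    // Collect the ids of the shares to reap on a node that has been detected as failed.
    public static List<String> sharesToReap(List<Share> sharesOnFailedNode) {
        List<String> reaped = new ArrayList<>();
        for (Share s : sharesOnFailedNode) {
            if (reapableOnNodeFailure(s)) {
                reaped.add(s.id);
            }
        }
        return reaped;
    }
}
```

With this shape, an unmanaged reservation is the only kind of work that survives a node failure, which matches the intent that only its owner can release it.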