[YUNIKORN-287] ask release can cause multiple reservations to be released - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.9
Component/s: core - scheduler
Labels:
- pull-request-available

Target Version: 0.9

Description

This is a complex scenario in which several things happen at once:

- One application has multiple pending requests.
- Two or more of those pending requests are reserved.
- One of the reserved pending requests is being allocated (the scheduler is done, and the cache confirmation is called asynchronously).
- The request being allocated is cancelled by the shim after the scheduler is done but before the cache confirms.

The cancellation by the shim triggers an update, and the cache update triggers an update as well. Together, these two updates cause the counters for the number of reservations to be decremented twice.

The side effect is that the node reserved by the ask that was not removed is skipped until that ask is allocated on a different node. If that takes a while (for instance, while waiting for a scale-up), scheduling is impacted.
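
The double decrement described above can be illustrated with a small Go sketch. This is not the YuniKorn code and not necessarily how the linked pull requests fix the issue: `reservationTracker`, `reserve`, `unreserveNaive` and `unreserve` are hypothetical names made up for illustration, and the sketch ignores locking. It only shows why an unconditional decrement breaks when the same ask is released by two separate updates, and how a release that first checks whether the reservation still exists stays correct.

```go
package main

// reservationTracker is a hypothetical stand-in for per-application
// reservation bookkeeping. It is NOT the YuniKorn type.
type reservationTracker struct {
	// reserved maps an ask key to the node that ask has reserved.
	reserved map[string]string
	// count is a separately maintained number of active reservations.
	count int
}

func newReservationTracker() *reservationTracker {
	return &reservationTracker{reserved: make(map[string]string)}
}

// reserve records that askKey has reserved nodeID.
// Reserving an ask that already holds a reservation is a no-op.
func (t *reservationTracker) reserve(askKey, nodeID string) {
	if _, ok := t.reserved[askKey]; ok {
		return
	}
	t.reserved[askKey] = nodeID
	t.count++
}

// unreserveNaive decrements the counter unconditionally. If two
// updates (for example a cancellation and an allocation confirmation)
// both call it for the same ask, the counter drops by two.
func (t *reservationTracker) unreserveNaive(askKey string) {
	delete(t.reserved, askKey)
	t.count--
}

// unreserve decrements the counter only if the reservation is still
// present, so a second release of the same ask changes nothing.
// It reports whether a reservation was actually removed.
func (t *reservationTracker) unreserve(askKey string) bool {
	if _, ok := t.reserved[askKey]; !ok {
		return false
	}
	delete(t.reserved, askKey)
	t.count--
	return true
}
```

With two reserved asks, calling `unreserveNaive` twice for the first ask leaves `count` at 0 although the second ask still holds its reservation in `reserved`. Calling `unreserve` twice leaves `count` at 1, which matches the one reservation that remains.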

Issue Links

links to

GitHub Pull Request #161

GitHub Pull Request #185

People

Assignee: Wilfred Spiegelenburg

Reporter: Wilfred Spiegelenburg

Votes: 0

Watchers: 1

Dates

Created: 16/Jul/20 12:27

Updated: 21/Jan/22 21:47

Resolved: 17/Jul/20 14:23