[YUNIKORN-2370] Proper event handling for failed headroom checks - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.5.0
Component/s: core - scheduler
Labels:
- pull-request-available

Target Version: 1.5.0

Description

Currently, we have this code inside Application.tryAllocate() (some lines removed for clarity):

func (sa *Application) tryAllocate(headRoom *resources.Resource, allowPreemption bool, preemptionDelay time.Duration, preemptAttemptsRemaining *int, nodeIterator func() NodeIterator, fullNodeIterator func() NodeIterator, getNodeFn func(string) *Node) *Allocation {
	...
	userHeadroom := ugm.GetUserManager().Headroom(sa.queuePath, sa.ApplicationID, sa.user)
	// get all the requests from the app sorted in order
	for _, request := range sa.sortedRequests {
		...
		if !userHeadroom.FitInMaxUndef(request.GetAllocatedResource()) {
			continue
		}

		// resource must fit in headroom otherwise skip the request (unless preemption could help)
		if !headRoom.FitInMaxUndef(request.GetAllocatedResource()) {
			// attempt preemption
			if allowPreemption && *preemptAttemptsRemaining > 0 {
				...
			}
			sa.appEvents.sendAppDoesNotFitEvent(request, headRoom) // <--- event
			continue
		}

There are two issues with this approach:
1. The event says "the application doesn't fit", while it is really the request that doesn't fit.
2. If there is no quota at all, only one request gets its own event; the rest get none.

Suggested approach:
1. Send a per-request event.
2. When an event is sent for a given request (e.g. failed user headroom), remember it and do not send it again for that request.
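
The suggested approach could be sketched roughly as follows. This is only an illustration under stated assumptions, not the change that was merged: `requestEventTracker`, `failureReason`, `shouldSend` and `forget` are hypothetical names, and requests are identified by a plain string key rather than by the real request type.

```go
package main

import (
	"fmt"
	"sync"
)

// failureReason is a hypothetical enum describing why a request was skipped.
type failureReason int

const (
	queueHeadroomExceeded failureReason = iota
	userHeadroomExceeded
)

// requestEventTracker remembers which failure events were already sent
// for each request. Hypothetical helper, not part of the YuniKorn code base.
type requestEventTracker struct {
	mu   sync.Mutex
	sent map[string]map[failureReason]bool
}

func newRequestEventTracker() *requestEventTracker {
	return &requestEventTracker{sent: make(map[string]map[failureReason]bool)}
}

// shouldSend returns true the first time it is called for a given
// (request key, reason) pair and false on every later call.
func (t *requestEventTracker) shouldSend(requestKey string, reason failureReason) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	reasons, ok := t.sent[requestKey]
	if !ok {
		reasons = make(map[failureReason]bool)
		t.sent[requestKey] = reasons
	}
	if reasons[reason] {
		return false
	}
	reasons[reason] = true
	return true
}

// forget drops the remembered state for a request, for example once the
// request has been allocated or removed.
func (t *requestEventTracker) forget(requestKey string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.sent, requestKey)
}

func main() {
	tracker := newRequestEventTracker()
	// Simulate three scheduling cycles in which both requests keep failing
	// the user headroom check; each request triggers exactly one event.
	for cycle := 0; cycle < 3; cycle++ {
		for _, key := range []string{"req-1", "req-2"} {
			if tracker.shouldSend(key, userHeadroomExceeded) {
				fmt.Printf("cycle %d: request %s does not fit in user headroom\n", cycle, key)
			}
		}
	}
}
```

Keying the state on both the request and the failure reason means a request that first fails the user headroom check and later fails the queue headroom check still produces one event for each reason, while repeated scheduling cycles stay quiet.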

Issue Links

relates to: YUNIKORN-2371 Add failed headroom checks to the allocation log (Closed)

links to: GitHub Pull Request #784

People

Assignee: Peter Bacsko

Reporter: Peter Bacsko

Votes: 0

Watchers: 2

Dates

Created: 31/Jan/24 09:46

Updated: 20/Mar/24 14:32

Resolved: 07/Feb/24 11:00