Details
- Bug
- Status: Closed
- Blocker
- Resolution: Fixed
None
Description
As reported in YUNIKORN-1996, we intermittently see many messages like the one below:
WARN objects/application.go:1504 queue update failed unexpectedly {"error": "allocation (map[memory:37580963840 pods:1 vcore:2000]) puts queue 'root.test-queue' over maximum allocation (map[memory:3300011278336 vcore:390584]), current usage (map[memory:3291983380480 pods:91 vcore:186000])"}
Restarting YuniKorn stops the messages. Creating this Jira to investigate why this happens, because it should not: we check whether there is enough resource headroom before calling
func (sa *Application) tryNode(node *Node, ask *AllocationAsk) *Allocation
which printed the above message, and we only call it when there is enough headroom.
There may be a bug in the headroom check.
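To make the numbers in the log message concrete, here is a minimal sketch of a headroom check using the values from the warning above. It is not the YuniKorn implementation: the `Resource` type and the `headroom` and `fitsIn` helpers are hypothetical names for illustration, and the sketch assumes that a resource type absent from the queue maximum (here `pods`) is unlimited.

```go
package main

import "fmt"

// Resource is a hypothetical stand-in for a resource quantity map
// (it is not the scheduler's own resource type).
type Resource map[string]int64

// headroom returns max minus usage for every resource type that max limits.
func headroom(max, usage Resource) Resource {
	out := Resource{}
	for name, limit := range max {
		out[name] = limit - usage[name]
	}
	return out
}

// fitsIn reports whether ask fits in room. Resource types that are
// absent from room are treated as unlimited (an assumption of this sketch).
func fitsIn(room, ask Resource) bool {
	for name, want := range ask {
		if avail, limited := room[name]; limited && want > avail {
			return false
		}
	}
	return true
}

func main() {
	// Values copied from the log message.
	max := Resource{"memory": 3300011278336, "vcore": 390584}
	usage := Resource{"memory": 3291983380480, "pods": 91, "vcore": 186000}
	ask := Resource{"memory": 37580963840, "pods": 1, "vcore": 2000}

	room := headroom(max, usage)
	fmt.Println(room["memory"], fitsIn(room, ask)) // prints: 8027897856 false
}
```

With these values the memory headroom (8027897856) is smaller than the requested memory (37580963840), so a check of this kind would reject the ask before tryNode is reached. That the queue update still fails suggests the headroom used for the check differs from the usage and maximum seen at update time.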