Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Won't Do
Description
When a predicate fails for a given node, we print a message at DEBUG level. The problem is that we repeat this in every scheduling cycle, which floods the logs:
2024-04-17T21:32:19.446Z DEBUG core.scheduler.node objects/node.go:403 running predicates failed {"allocationKey": "2be04314-bed0-4385-9ae7-50ed0ef9d9d5", "nodeID": "kwok-node-zzn7w", "allocateFlag": true, "error": "predicates were not running because pod or node was not found in cache"}
2024-04-17T21:33:24.417Z DEBUG core.scheduler.node objects/node.go:403 running predicates failed {"allocationKey": "2be04314-bed0-4385-9ae7-50ed0ef9d9d5", "nodeID": "kwok-node-zzn7w", "allocateFlag": true, "error": "predicates were not running because pod or node was not found in cache"}
2024-04-17T21:33:24.417Z DEBUG core.scheduler.node objects/node.go:403 running predicates failed {"allocationKey": "2be04314-bed0-4385-9ae7-50ed0ef9d9d5", "nodeID": "kwok-node-zzn7w", "allocateFlag": true, "error": "predicates were not running because pod or node was not found in cache"}
Another problematic part is preAllocateCheck() when we have an allocation ask with a zero resource:
func (sn *Node) preAllocateCheck(res *resources.Resource, resKey string) bool {
    // cannot allocate zero or negative resource
    if !resources.StrictlyGreaterThanZero(res) {
        // <-- will be printed from every node
        log.Log(log.SchedNode).Debug("pre alloc check: requested resource is zero",
            zap.String("nodeID", sn.NodeID))
        return false
    }
    ...
We need to reduce the amount of output with RateLimitedLogger.
Issue Links
- is related to YUNIKORN-2526 Discrepancy between shim cache and core app/task list after scheduler restart (Open)
- links to