Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
In some Kubernetes environments, namespaces are configured with a LimitRange:
apiVersion: v1
kind: LimitRange
metadata:
  name: default
spec:
  limits:
  - default:
      cpu: 100m
      ephemeral-storage: 10Gi
      memory: 128Mi
    defaultRequest:
      cpu: 100m
      ephemeral-storage: 1Gi
      memory: 128Mi
    type: Container
However, the shim creates placeholder pods with resource requests only. The LimitRange then fills in its default limits, and when a request exceeds the defaulted limit the API server rejects the placeholder pod, which leaves the whole job pending:
2023-09-29T13:19:17.851-0700 ERROR cache/placeholder_manager.go:99 failed to create placeholder pod {"error": "Pod \"tg-foo-task-group-foo-gang-0\" is invalid: spec.containers[0].resources.requests: Invalid value: \"1\": must be less than or equal to cpu limit"}
I propose setting the placeholder pod's limits to the same values as its requests, so that the namespace's LimitRange does not auto-set lower limit values.
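The failure and the proposed fix can be sketched in Go as below. This is only a simplified model, not the shim's code: `Resources`, `Container`, `applyDefaultLimits`, `validate`, and `limitsFromRequests` are hypothetical names, quantities are plain integers (CPU in millicores) rather than Kubernetes quantity types, and the defaulting and validation steps only approximate what the API server does.

```go
package main

import "fmt"

// Resources maps a resource name to a quantity in base units
// (CPU in millicores). Hypothetical stand-in for the Kubernetes types.
type Resources map[string]int64

// Container holds only the fields relevant to this sketch.
type Container struct {
	Requests Resources
	Limits   Resources
}

// applyDefaultLimits models LimitRange defaulting: a default limit is
// filled in only for resources that have no explicit limit.
func applyDefaultLimits(c *Container, defaults Resources) {
	if c.Limits == nil {
		c.Limits = Resources{}
	}
	for name, def := range defaults {
		if _, ok := c.Limits[name]; !ok {
			c.Limits[name] = def
		}
	}
}

// validate models the check that every request is <= its limit.
func validate(c Container) error {
	for name, req := range c.Requests {
		if lim, ok := c.Limits[name]; ok && req > lim {
			return fmt.Errorf("%s request %d must be less than or equal to limit %d", name, req, lim)
		}
	}
	return nil
}

// limitsFromRequests is the proposed fix: copy each request into the
// limits so that defaulting has nothing lower to inject.
func limitsFromRequests(requests Resources) Resources {
	limits := make(Resources, len(requests))
	for name, qty := range requests {
		limits[name] = qty
	}
	return limits
}

func main() {
	defaults := Resources{"cpu": 100} // default cpu limit of 100m

	// Requests only: the 100m default limit is below the 1000m request.
	broken := Container{Requests: Resources{"cpu": 1000}}
	applyDefaultLimits(&broken, defaults)
	fmt.Println("requests only:", validate(broken))

	// Limits set equal to requests: defaulting leaves cpu untouched.
	fixed := Container{Requests: Resources{"cpu": 1000}}
	fixed.Limits = limitsFromRequests(fixed.Requests)
	applyDefaultLimits(&fixed, defaults)
	fmt.Println("limits == requests:", validate(fixed))
}
```

Copying requests into limits keeps the placeholder's footprint unchanged while preventing the namespace default from undercutting the request.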
Attachments
Issue Links
- Blocked: YUNIKORN-1908 Support GPU in placeholders (Closed)
- fixes: YUNIKORN-2682 YuniKorn Gang Scheduling Issue: Executors Failing to Start When Running Multiple Applications (Closed)
- links to