Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
Recently we encountered several gang scheduling errors in the CI e2e tests. In every failure the test was waiting for placeholders (with a 10M memory limit) to be created. However, some placeholders failed with the following OOM-killed error:
“Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: container init was OOM-killed (memory limit too low?): unknown”
The root cause might be the varying memory peak when the OCI runtime creates multiple containers. We can try raising the placeholder memory limit from 10M to 20M in the e2e tests. (Sleep jobs already use 20M of memory.)
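A rough sketch of the proposed change, in case it helps the discussion. The `TaskGroup` type and `newTaskGroup` helper below are hypothetical simplifications invented for illustration, not the actual e2e test code or the real task group type. The only detail taken from this issue is the 10M to 20M bump.

```go
package main

import "fmt"

// TaskGroup is a hypothetical, simplified stand-in for a gang scheduling
// task group definition. It is NOT the real type used by the e2e tests.
type TaskGroup struct {
	Name        string
	MinMember   int
	MinResource map[string]string
}

// placeholderMemory is the proposed placeholder memory limit.
// The failing tests used "10M"; sleep jobs already use "20M".
const placeholderMemory = "20M"

// newTaskGroup (hypothetical helper) builds a task group whose
// placeholders are limited to placeholderMemory.
func newTaskGroup(name string, minMember int) TaskGroup {
	return TaskGroup{
		Name:        name,
		MinMember:   minMember,
		MinResource: map[string]string{"memory": placeholderMemory},
	}
}

func main() {
	tg := newTaskGroup("group-a", 3)
	fmt.Printf("%s: %d placeholders, memory=%s\n",
		tg.Name, tg.MinMember, tg.MinResource["memory"])
}
```

Keeping the limit in a single constant would let every gang scheduling test pick up the new value at once, assuming the tests share a helper like this.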
Some failed e2e tests from the last 3 weeks: