Details
Type: Bug
Status: Open
Priority: Not a Priority
Resolution: Unresolved
Affects Version/s: 1.6.3
Fix Version/s: None
Description
I added the taskmanager.jvm-exit-on-oom flag to the TaskManager startup arguments. During testing (simulating OOM), I noticed that some YARN containers stayed in the RUNNING state even though they should have been killed on OutOfMemoryError with the flag enabled.
I found RUNNING containers whose last log lines looked like this:
2019-07-26 13:32:51,396 ERROR org.apache.flink.runtime.taskmanager.Task - Encountered fatal error java.lang.OutOfMemoryError - terminating the JVM
java.lang.OutOfMemoryError: Metaspace
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
Does YARN make it difficult to forcefully kill the JVM after an OutOfMemoryError?
Workaround
With the -XX:+ExitOnOutOfMemoryError JVM flag, the containers are always terminated.
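For reference, here is a minimal sketch of how both settings might look in flink-conf.yaml. It assumes the options are set in the configuration file rather than passed as dynamic properties on the command line, and that the JVM flag is passed through env.java.opts.taskmanager. This report does not state how the flags were actually supplied, so treat the placement as an assumption. -XX:+ExitOnOutOfMemoryError requires JDK 8u92 or later.

```yaml
# Flink-level handling: the TaskManager terminates its JVM when a task
# thread throws an OutOfMemoryError (the setting this issue is about).
taskmanager.jvm-exit-on-oom: true

# JVM-level workaround: the JVM itself exits on the first OutOfMemoryError.
# Assumption: the flag is passed to TaskManagers via env.java.opts.taskmanager.
env.java.opts.taskmanager: -XX:+ExitOnOutOfMemoryError
```

The two settings are independent of each other. The second is handled by the JVM itself and does not rely on Flink's own error handling, which may be why it behaved differently in the tests above.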