Details

Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 3.3.1, 4.0.0
Fix Version/s: None
Component/s: None
Description
Currently, the Executor startup script hardcodes -XX:OnOutOfMemoryError='kill %p' into the executor JVM options, which causes the process to be killed whenever the Executor encounters an OOM (OutOfMemoryError):
// code in YarnSparkHadoopUtil
private[yarn] def addOutOfMemoryErrorArgument(javaOpts: ListBuffer[String]): Unit = {
  if (!javaOpts.exists(_.contains("-XX:OnOutOfMemoryError"))) {
    if (Utils.isWindows) {
      javaOpts += escapeForShell("-XX:OnOutOfMemoryError=taskkill /F /PID %%%%p")
    } else {
      javaOpts += "-XX:OnOutOfMemoryError='kill %p'"
    }
  }
}
As a result, the YarnAllocator only sees exit code 143 and cannot accurately determine the reason for the Executor's termination from it. Moreover, the CoarseGrainedExecutorBackend cannot guarantee that StatusUpdate messages reach the Driver before the process is killed.
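For context, a small sketch (not Spark code) of why exit code 143 is ambiguous: 143 = 128 + 15, i.e. the JVM was terminated by SIGTERM, which is exactly what an external kill by the NodeManager or a preemption would also produce, so the allocator cannot tell an OOM-triggered 'kill %p' apart from any other SIGTERM. The helper below is hypothetical and purely illustrative, not a Spark API.

object ExitCodeAmbiguity {
  // Decode a container exit code: values above 128 only say which signal
  // terminated the JVM, not who sent it or why.
  def decodeExitCode(code: Int): String =
    if (code > 128) s"terminated by signal ${code - 128}" // 143 -> signal 15 (SIGTERM)
    else s"exited normally with status $code"

  def main(args: Array[String]): Unit = {
    // Could be an OOM-triggered kill, a NodeManager kill, preemption, ...
    println(decodeExitCode(143))
  }
}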
Could we remove this setting, since users can set it via the spark.executor.extraJavaOptions parameter if necessary?
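If the hardcoded flag were removed, users who still want the old behavior could pass the option themselves. A minimal sketch, assuming a standard SparkSession-based application (the object and app names are illustrative):

import org.apache.spark.sql.SparkSession

object OomKillOptIn {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("oom-kill-opt-in")
      // Re-add the OOM kill behavior explicitly instead of relying on the
      // hardcoded default in the executor launch script.
      .config("spark.executor.extraJavaOptions", "-XX:OnOutOfMemoryError='kill %p'")
      .getOrCreate()

    // ... application logic ...

    spark.stop()
  }
}

The same option can also be supplied at submit time, e.g. via spark-submit --conf spark.executor.extraJavaOptions="-XX:OnOutOfMemoryError='kill %p'".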
Executor log:
Driver log: