  Spark / SPARK-50186

Remove Hardcoded OnOutOfMemoryError Setting in Executor Startup Script


Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.3.1, 4.0.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      Currently, the Executor startup script hardcodes -XX:OnOutOfMemoryError='kill %p' into the executor JVM options, which causes the process to be killed whenever the Executor encounters an OutOfMemoryError.

      // Code in YarnSparkHadoopUtil: appended to the executor JVM options
      // unless the user has already supplied an -XX:OnOutOfMemoryError flag.
      private[yarn] def addOutOfMemoryErrorArgument(javaOpts: ListBuffer[String]): Unit = {
        if (!javaOpts.exists(_.contains("-XX:OnOutOfMemoryError"))) {
          if (Utils.isWindows) {
            // %p expands to the JVM's PID; the extra % characters escape it for the shell.
            javaOpts += escapeForShell("-XX:OnOutOfMemoryError=taskkill /F /PID %%%%p")
          } else {
            // 'kill' with no signal argument sends SIGTERM to the executor JVM.
            javaOpts += "-XX:OnOutOfMemoryError='kill %p'"
          }
        }
      }

      As a result, the YarnAllocator receives an exit code of 143 and cannot accurately determine from it why the Executor terminated: kill sends SIGTERM by default, and a JVM killed by SIGTERM exits with status 128 + 15 = 143, which is indistinguishable from any other external kill. Moreover, the CoarseGrainedExecutorBackend cannot guarantee that StatusUpdate messages reach the Driver before the process is killed.
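
      For illustration, a minimal sketch (assuming a POSIX shell on the host) showing where the 143 comes from:

      import scala.sys.process._

      // A child process terminated by SIGTERM reports exit status 128 + 15 = 143
      // to its parent -- all the YarnAllocator sees, whatever triggered the kill.
      object ExitCode143Demo {
        def main(args: Array[String]): Unit = {
          // The shell kills itself with SIGTERM, mimicking 'kill %p' hitting the executor JVM.
          val status = Process(Seq("sh", "-c", "kill -TERM $$")).!
          println(s"exit status: $status") // prints: exit status: 143
        }
      }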
      Could we remove this setting? Users who still need the behavior can set it themselves via the spark.executor.extraJavaOptions parameter, as sketched below.
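
      For reference, a minimal sketch of opting back into the kill-on-OOM behavior per application (the app name is illustrative):

      import org.apache.spark.SparkConf

      // Re-add the flag explicitly instead of relying on the hardcoded default.
      val conf = new SparkConf()
        .setAppName("opt-in-oom-kill") // illustrative
        .set("spark.executor.extraJavaOptions", "-XX:OnOutOfMemoryError='kill %p'")

      // Equivalent on the command line:
      //   spark-submit --conf "spark.executor.extraJavaOptions=-XX:OnOutOfMemoryError='kill %p'" ...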

      Executor log: (attached as a screenshot; see Attachments)

      Driver log: (attached as a screenshot; see Attachments)

      Attachments

        1. image-2024-10-31-14-17-51-349.png (64 kB, Ruochen Zou)
        2. image-2024-10-31-14-14-06-723.png (264 kB, Ruochen Zou)


          People

            Assignee: Unassigned
            Reporter: Ruochen Zou (zortsou)
            Votes: 0
            Watchers: 2
