  Spark / SPARK-48547

Add opt-in flag to have SparkSubmit automatically call System.exit after user code main method exits



    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: Deploy


      This PR proposes to add a new flag, `spark.submit.callSystemExitOnMainExit` (default false), which when true will instruct SparkSubmit to call System.exit() in the JVM once the user code's main method has exited (for Java / Scala jobs) or once the user's Python or R script has exited.
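
      As a rough illustration of how a job submitter might opt in, the sketch below assembles a spark-submit command line with the proposed flag enabled. It assumes the flag is accepted through the usual `--conf key=value` mechanism; the class `SubmitCommandSketch`, the main class `com.example.MyJob`, and the jar path are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class SubmitCommandSketch {
    // Flag name as proposed in this issue; defaults to false when unset.
    static final String EXIT_FLAG = "spark.submit.callSystemExitOnMainExit";

    /** Builds a spark-submit argument list with the opt-in flag set to true. */
    static List<String> withExitOnMainExit(String mainClass, String appJar) {
        List<String> cmd = new ArrayList<>();
        cmd.add("spark-submit");
        cmd.add("--conf");
        cmd.add(EXIT_FLAG + "=true");
        cmd.add("--class");
        cmd.add(mainClass);
        cmd.add(appJar);
        return cmd;
    }

    public static void main(String[] args) {
        // Hypothetical job coordinates, for illustration only.
        List<String> cmd = withExitOnMainExit("com.example.MyJob", "/path/to/my-job.jar");
        System.out.println(String.join(" ", cmd));
    }
}
```

      A platform operator would more likely set the same key once in a shared defaults file than per invocation, assuming the flag is read like other Spark properties.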

      This is intended to address a longstanding issue where SparkSubmit invocations might hang after user code has completed. The cause lies in how the JVM decides when to shut down.

      According to Java’s java.lang.Runtime docs:

      The Java Virtual Machine initiates the shutdown sequence in response to one of several events:

      1. when the number of live non-daemon threads drops to zero for the first time (see note below on the JNI Invocation API);
      2. when the Runtime.exit or System.exit method is called for the first time; or
      3. when some external event occurs, such as an interrupt or a signal is received from the operating system.

      For Python and R programs, SparkSubmit’s PythonRunner and RRunner will call System.exit() if the user program exits with a non-zero exit code (see the Python and R runner code).

      But for Java and Scala programs, plus any successful R or Python programs, Spark will not automatically call System.exit.

      In those situations, the JVM will only shut down when, via event (1), all non-daemon threads have exited (unless the job is cancelled and sent an external interrupt / kill signal, corresponding to event (3)).

      Thus, non-daemon threads might cause logically-completed spark-submit jobs to hang rather than completing.

      The non-daemon threads are not always under Spark's own control and may not necessarily be cleaned up by SparkContext.stop().
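
      The hang described above can be reproduced without Spark. The following minimal sketch starts one non-daemon and one daemon thread and lists which live threads would keep the JVM from shutting down via event (1). The class `NonDaemonThreadDemo` and the thread names are hypothetical and stand in for, say, a thread pool leaked by a third-party library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

final class NonDaemonThreadDemo {
    /** Names of live non-daemon threads other than the calling thread. */
    static List<String> liveNonDaemonThreads() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon() && t != Thread.currentThread()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    /** Starts a thread that blocks until the latch is released. */
    static Thread startWorker(String name, boolean daemon, CountDownLatch release) {
        Thread t = new Thread(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(daemon);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        CountDownLatch release = new CountDownLatch(1);
        startWorker("hypothetical-leaked-thread", false, release);
        startWorker("hypothetical-daemon-thread", true, release);
        System.out.println("Threads blocking JVM exit: " + liveNonDaemonThreads());
        // Without this call, main returns but the JVM keeps running,
        // because the non-daemon worker never finishes.
        release.countDown();
    }
}
```

      Removing the final `release.countDown()` call leaves the process alive after `main` returns, which is the same symptom a logically-completed spark-submit job shows when a stray non-daemon thread survives.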

      Thus, it is useful to have an opt-in option that makes SparkSubmit automatically call `System.exit()` upon main method exit (which usually, but not always, corresponds to job completion). This option will allow users and data platform operators to enforce System.exit() calls without having to modify individual jobs' code.
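
      The intended control flow can be sketched as follows. This is not SparkSubmit's actual implementation: `ExitAfterMainSketch`, its `run` method, and the use of a plain `Map` for configuration are assumptions made for illustration, and the exit action is passed in as a parameter so the logic can be exercised without terminating the JVM.

```java
import java.util.Map;
import java.util.function.IntConsumer;

final class ExitAfterMainSketch {
    // Flag name as proposed in this issue.
    static final String EXIT_FLAG = "spark.submit.callSystemExitOnMainExit";

    /**
     * Runs the user's entry point and, only if the flag is true,
     * invokes the exit action with status 0 once it returns normally.
     * If userMain throws, the exception propagates and no exit is forced here.
     */
    static void run(Runnable userMain, Map<String, String> conf, IntConsumer exit) {
        userMain.run();
        boolean exitOnMainExit =
            Boolean.parseBoolean(conf.getOrDefault(EXIT_FLAG, "false"));
        if (exitOnMainExit) {
            // Corresponds to shutdown event (2): the JVM stops even if
            // non-daemon threads are still alive.
            exit.accept(0);
        }
    }
}
```

      In a real launcher the exit action would be `System::exit`. With the flag unset or false the method simply returns, preserving today's behavior of waiting for all non-daemon threads to finish.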





              joshrosen Josh Rosen