SPARK-43952: Cancel Spark jobs not only by a single "jobgroup", but allow multiple "job tags"


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5.0
    • Fix Version/s: 3.5.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Currently, the only way to cancel running Spark jobs is SparkContext.cancelJobGroup, which takes a job group name previously set with SparkContext.setJobGroup. This is problematic when multiple, independent parts of the system want to perform cancellation, because each of them sets its own id and only one group id can be active on a thread at a time.
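      For reference, a minimal sketch of the current mechanism (the group id, description, app name, and master below are only illustrative):

      {code:scala}
      import org.apache.spark.{SparkConf, SparkContext}

      object JobGroupExample {
        def main(args: Array[String]): Unit = {
          val sc = new SparkContext(new SparkConf().setAppName("job-group-demo").setMaster("local[*]"))

          // All jobs submitted from this thread now belong to this single group id.
          sc.setJobGroup("user-query-42", "my long-running query", interruptOnCancel = true)

          // If framework code on the same thread calls setJobGroup again, it silently
          // replaces "user-query-42" -- there is no way to keep several ids at once.

          // Cancellation is all-or-nothing per group id.
          sc.cancelJobGroup("user-query-42")

          sc.stop()
        }
      }
      {code}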

      For example, https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala#L133 sets its own job group, which may override the job group set by the user. As a result, if the user cancels the job group they set, the broadcast jobs launched from within their jobs will not be cancelled.

      As a solution, consider adding SparkContext.addJobTag / SparkContext.removeJobTag, which would allow attaching multiple "tags" to jobs, and introducing SparkContext.cancelJobsByTag to allow more flexible cancellation of jobs.
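      A hypothetical usage sketch of the proposed API follows; the method names (addJobTag, removeJobTag, cancelJobsByTag) and the tag values are only the ones suggested in this issue, and the final merged signatures may differ:

      {code:scala}
      import org.apache.spark.{SparkConf, SparkContext}

      object JobTagExample {
        def main(args: Array[String]): Unit = {
          val sc = new SparkContext(new SparkConf().setAppName("job-tag-demo").setMaster("local[*]"))

          // Different layers attach their own tags; tags accumulate instead of
          // overwriting each other the way setJobGroup does.
          sc.addJobTag("user-query-42")       // tag set by the user
          sc.addJobTag("broadcast-exchange")  // tag set by framework code on the same thread

          // ... jobs submitted here would carry both tags ...

          // Each party can cancel only the jobs carrying its own tag.
          sc.cancelJobsByTag("user-query-42")

          // A tag can be detached from the calling thread when no longer needed.
          sc.removeJobTag("broadcast-exchange")

          sc.stop()
        }
      }
      {code}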


          People

            Assignee: Juliusz Sompolski
            Reporter: Juliusz Sompolski
