Spark / SPARK-20904

Task failures during shutdown cause problems with preempted executors


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 2.2.1, 2.3.0
    • Component/s: Spark Core, YARN
    • Labels: None

      Description

      Spark runs tasks in a thread pool that uses daemon threads in each executor. That means that when the JVM gets a signal to shut down, those tasks keep running.

      Now when YARN preempts an executor, it sends a SIGTERM to the process, triggering JVM shutdown. That causes shutdown hooks to run, which may cause user code running in those tasks to fail and report task failures to the driver. Those failures are then counted towards the maximum number of allowed failures, even though in this case we don't want that, because the executor was preempted.

      So we need a better way to handle that situation.
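      One possible direction, sketched below as a minimal illustration (this is not Spark's actual code; `ShutdownTracker` and `classifyFailure` are hypothetical names): register a JVM shutdown hook that flips a flag, and have the task runner consult that flag before counting a failure, so failures seen after SIGTERM are treated as executor loss rather than task failures.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical sketch: track whether the JVM has begun shutting down,
// so failure reporting can distinguish preemption from real task failure.
object ShutdownTracker {
  private val shuttingDown = new AtomicBoolean(false)

  // The hook flips the flag when the JVM starts shutting down,
  // e.g. after the SIGTERM that YARN sends on preemption.
  Runtime.getRuntime.addShutdownHook(new Thread(() => shuttingDown.set(true)))

  def inShutdown: Boolean = shuttingDown.get()

  // A task runner could consult the flag before reporting a failure:
  // during shutdown the failure would not count against the task's
  // maximum allowed failures.
  def classifyFailure(error: Throwable): String =
    if (inShutdown) "executor-lost (not counted)"
    else "task-failure (counted)"
}
```

      In normal operation the flag is false, so a genuine error still counts; only failures observed after the shutdown hook has fired are reclassified.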


            People

            • Assignee: vanzin Marcelo Masiero Vanzin
            • Reporter: vanzin Marcelo Masiero Vanzin
            • Votes: 0
            • Watchers: 5
