SPARK-1740: PySpark cancellation kills unrelated PySpark workers


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.1.0
    • Component/s: PySpark
    • Labels: None

    Description

      PySpark cancellation calls SparkEnv#destroyPythonWorker. Since each Python worker runs in its own process, this would seem like a sensible, narrowly scoped thing to do. Unfortunately, this method actually destroys the Python daemon and all of its associated workers, which means cancelling one job can cause failures in unrelated PySpark jobs.

      The severity of this bug is limited by the fact that the PySpark daemon is easily recreated, so the affected tasks will succeed after being restarted.
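      Conceptually, the daemon/worker relationship described above looks something like the sketch below. The names (PythonDaemon, WorkerHandle, CancellationDemo) are illustrative only, not Spark's actual API; the point is that every worker's lifetime is tied to the shared daemon.

      {code:scala}
      // Illustrative sketch only; class and method names are hypothetical, not Spark's API.
      import scala.collection.mutable

      // One handle per forked Python worker; each serves a single task.
      class WorkerHandle(val taskId: Long) {
        @volatile var alive = true
        def kill(): Unit = { alive = false }
      }

      // One daemon per executor; it forks one worker per running task.
      class PythonDaemon {
        private val workers = mutable.Set[WorkerHandle]()

        def forkWorker(taskId: Long): WorkerHandle = {
          val w = new WorkerHandle(taskId)
          workers += w
          w
        }

        // Destroying the daemon tears down every forked worker, including
        // workers serving tasks that were never cancelled.
        def destroy(): Unit = {
          workers.foreach(_.kill())
          workers.clear()
        }
      }

      object CancellationDemo extends App {
        val daemon = new PythonDaemon
        val cancelledTaskWorker = daemon.forkWorker(taskId = 1L)
        val unrelatedTaskWorker = daemon.forkWorker(taskId = 2L)

        // The 1.0.0 behaviour: cancelling task 1 destroys the whole daemon...
        daemon.destroy()

        // ...so the unrelated task's worker dies too and its task must be retried.
        println(s"unrelated worker alive after cancellation: ${unrelatedTaskWorker.alive}")
      }
      {code}

      Because worker lifetime is tied to the daemon, any per-task cancellation path that destroys the daemon necessarily disrupts every other task running Python code on that executor.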


          People

            Assignee: Davies Liu (davies)
            Reporter: Aaron Davidson (ilikerps)
