SPARK-18553

Executor loss may cause TaskSetManager to be leaked

    Details

      Description

      Due to a bug in TaskSchedulerImpl, the sudden and complete loss of an executor may cause a TaskSetManager to be leaked. The leaked TaskSetManager keeps ShuffleDependencies and other data structures alive indefinitely, leading to various types of resource leaks (including shuffle file leaks).

      In a nutshell, TaskSchedulerImpl did not maintain its own mapping from executorId to running task IDs, so when an executor was lost entirely it could not clean up the corresponding entries in its taskId to TaskSetManager maps.
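
      The bookkeeping described above can be illustrated with a minimal sketch. This is not the actual TaskSchedulerImpl code or the actual patch: the class names (SchedulerBookkeeping, TaskSetManagerStub), the method names, and the map field names are all hypothetical, chosen only to mirror the wording of the description. It shows how an executorId to running-task-IDs index lets the per-task maps be cleaned up when an executor disappears.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a TaskSetManager; not Spark's actual class.
final class TaskSetManagerStub(val name: String)

// Simplified, illustrative model of the scheduler's bookkeeping.
final class SchedulerBookkeeping {
  // Per-task maps of the kind the description says could not be cleaned up.
  private val taskIdToTaskSetManager =
    mutable.HashMap.empty[Long, TaskSetManagerStub]
  private val taskIdToExecutorId = mutable.HashMap.empty[Long, String]

  // Assumed extra index: which task IDs are currently running on each executor.
  private val executorIdToRunningTaskIds =
    mutable.HashMap.empty[String, mutable.HashSet[Long]]

  def taskLaunched(taskId: Long, executorId: String, tsm: TaskSetManagerStub): Unit = {
    taskIdToTaskSetManager(taskId) = tsm
    taskIdToExecutorId(taskId) = executorId
    executorIdToRunningTaskIds
      .getOrElseUpdate(executorId, mutable.HashSet.empty[Long]) += taskId
  }

  def taskFinished(taskId: Long): Unit = {
    taskIdToTaskSetManager.remove(taskId)
    taskIdToExecutorId.remove(taskId).foreach { execId =>
      executorIdToRunningTaskIds.get(execId).foreach(_ -= taskId)
    }
  }

  // On total executor loss, use the index to drop every per-task entry
  // that still points at a TaskSetManager, so nothing keeps it alive.
  def executorLost(executorId: String): Unit = {
    executorIdToRunningTaskIds.remove(executorId).foreach { taskIds =>
      taskIds.foreach { taskId =>
        taskIdToTaskSetManager.remove(taskId)
        taskIdToExecutorId.remove(taskId)
      }
    }
  }

  // Number of tasks that still hold a reference to a TaskSetManager.
  def trackedTaskCount: Int = taskIdToTaskSetManager.size
}
```

      Without the per-executor index, executorLost would have no direct way to find the task IDs that were running on the lost executor, and their entries (and the TaskSetManager they reference) would remain reachable.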

            People

            • Assignee: Josh Rosen (joshrosen)
            • Reporter: Josh Rosen (joshrosen)