Spark / SPARK-4653 DAGScheduler refactoring and cleanup / SPARK-20116

Remove task-level functionality from the DAGScheduler


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: Scheduler, Spark Core

    Description

      Long, long ago, the scheduler code was more modular, and the DAGScheduler handled the logic of scheduling DAGs of stages (as the name suggests) and the TaskSchedulerImpl handled scheduling the tasks within a stage. Over time, more and more task-specific functionality has been added to the DAGScheduler, and now, the DAGScheduler duplicates a bunch of the task tracking that's done by other scheduler components. This makes the scheduler code harder to reason about, and has led to some tricky bugs (e.g., SPARK-19263). We should move all of this functionality back to the TaskSchedulerImpl and TaskSetManager, which should "hide" that complexity from the DAGScheduler.
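      The layering described above can be illustrated with a toy model. This is a minimal sketch and not Spark's code: `ToyStageScheduler`, `ToyTaskSetManager`, `run_task`, and `max_failures` are hypothetical names invented for illustration, and the real DAGScheduler, TaskSchedulerImpl, and TaskSetManager are far more involved. The point it tries to show is only the boundary: the stage-level scheduler reasons about stage dependencies and sees a single success or failure per stage, while all per-task bookkeeping (attempts, failure counts) lives in the task-set manager.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class ToyTaskSetManager:
    """Hypothetical stand-in that owns all per-task state for one stage."""
    stage_id: int
    num_tasks: int
    max_failures: int = 4  # assumed retry limit, chosen for illustration
    failures: Dict[int, int] = field(default_factory=dict)
    finished: Set[int] = field(default_factory=set)

    def run(self, run_task: Callable[[int, int], bool]) -> bool:
        """Run every task, retrying failures; True means the stage succeeded."""
        for task in range(self.num_tasks):
            while task not in self.finished:
                attempt = self.failures.get(task, 0)
                if run_task(task, attempt):
                    self.finished.add(task)
                else:
                    self.failures[task] = attempt + 1
                    if self.failures[task] >= self.max_failures:
                        return False  # give up on the whole stage
        return True


class ToyStageScheduler:
    """Hypothetical stand-in that knows only stages and their dependencies."""

    def __init__(self, parents: Dict[int, List[int]], tasks_per_stage: Dict[int, int]):
        self.parents = parents
        self.tasks_per_stage = tasks_per_stage
        self.completed: List[int] = []

    def submit(self, stage_id: int, run_task: Callable[[int, int, int], bool]) -> bool:
        # Make sure every parent stage has completed before running this one.
        for parent in self.parents.get(stage_id, []):
            if parent not in self.completed and not self.submit(parent, run_task):
                return False
        # Hand the stage to the task layer; only a stage-level result comes back.
        manager = ToyTaskSetManager(stage_id, self.tasks_per_stage[stage_id])
        if not manager.run(lambda task, attempt: run_task(stage_id, task, attempt)):
            return False
        self.completed.append(stage_id)
        return True


def flaky(stage: int, task: int, attempt: int) -> bool:
    # Task 1 of stage 0 fails on its first attempt; everything else succeeds.
    return not (stage == 0 and task == 1 and attempt == 0)


scheduler = ToyStageScheduler(parents={1: [0]}, tasks_per_stage={0: 2, 1: 3})
ok = scheduler.submit(1, flaky)
```

      In this sketch the retry of the flaky task never surfaces in `ToyStageScheduler`, which is the kind of "hiding" the description asks for. How failures such as fetch failures would cross that boundary in Spark itself is outside what this toy model covers.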


          People

            Assignee: Kay Ousterhout (kayousterhout)
            Reporter: Kay Ousterhout (kayousterhout)
            Votes: 1
            Watchers: 4
