[SPARK-4383] Delay scheduling doesn't work right when jobs have tasks with different locality levels


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.2, 1.1.0
    • Fix Version/s: 1.3.0
    • Component/s: Scheduler, Spark Core
    • Labels: None

    Description

      Copied from mailing list discussion:

      Our application loads data from HDFS in the same Spark cluster, so it gets both NODE_LOCAL and RACK_LOCAL tasks during the loading stage. If all tasks in the loading stage have the same locality level, either NODE_LOCAL or RACK_LOCAL, it works fine.
      But if the tasks in the loading stage have mixed locality levels, such as 3 NODE_LOCAL tasks and 2 RACK_LOCAL tasks, then the TaskSetManager of the loading stage submits the 3 NODE_LOCAL tasks as soon as resources are offered, then waits for spark.locality.wait.node (set to 30 minutes in our case), so the 2 RACK_LOCAL tasks wait 30 minutes even though resources are available.

      Fixing this is quite tricky – do we need to track the locality level individually for each task?
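
      To make the reported behavior concrete, here is a minimal sketch (not Spark's actual TaskSetManager code; names and signatures are illustrative) of how a single per-task-set locality level produces the stall: the task set only relaxes from NODE_LOCAL to RACK_LOCAL after the configured node-level wait elapses, so the RACK_LOCAL tasks are gated even when executors are idle.

      ```java
      public class DelaySchedulingSketch {
          enum Locality { NODE_LOCAL, RACK_LOCAL }

          // Illustrative stand-in for the TaskSetManager's single "currently
          // allowed" locality level: it relaxes from NODE_LOCAL to RACK_LOCAL
          // only after nodeWaitMs (spark.locality.wait.node) with no
          // node-local task launch.
          static Locality allowedLevel(long waitedMs, long nodeWaitMs) {
              return waitedMs >= nodeWaitMs ? Locality.RACK_LOCAL
                                            : Locality.NODE_LOCAL;
          }

          public static void main(String[] args) {
              long nodeWaitMs = 30L * 60 * 1000; // 30 minutes, as in the report

              // Right after the 3 NODE_LOCAL tasks launch, the 2 RACK_LOCAL
              // tasks are still blocked by the node-level wait, despite free
              // resources:
              System.out.println(allowedLevel(0, nodeWaitMs));

              // Only once the full wait has elapsed may RACK_LOCAL tasks run:
              System.out.println(allowedLevel(nodeWaitMs, nodeWaitMs));
          }
      }
      ```

      Tracking a separate wait per task (or per pending-task locality bucket) would let the scheduler offer the RACK_LOCAL tasks immediately, which is the question raised above.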

      Attachments

        Issue Links

          Activity

            People

              Assignee: Unassigned
              Reporter: Kay Ousterhout
              Votes: 0
              Watchers: 1

              Dates

                Created:
                Updated:
                Resolved: