[SPARK-18967] Locality preferences should be used when scheduling even when delay scheduling is turned off - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.1.0
Fix Version/s: 2.2.0
Component/s: Scheduler, Spark Core
Labels:
None

Description

If you turn delay scheduling off by setting spark.locality.wait=0, you effectively turn off the use the of locality preferences when there is a bulk scheduling event. TaskSchedulerImpl will use resources based on whatever random order it decides to shuffle them, rather than taking advantage of the most local options.

This happens because TaskSchedulerImpl offers resources to a TaskSetManager one at a time, each time subject to a maxLocality constraint. However, that constraint doesn't move through all possible locality levels – it uses tsm.myLocalityLevels . And tsm.myLocalityLevels skips locality levels completely if the wait == 0 . So with delay scheduling off, TaskSchedulerImpl immediately jumps to giving tsms the offers with maxLocality = ANY.

WORKAROUND: instead of setting spark.locality.wait=0, use spark.locality.wait=1ms. The one downside of this is if you have tasks that actually take less than 1ms. You could even run into ~~SPARK-18886~~. But that is a relatively unlikely scenario for real workloads.

Attachments

Issue Links

links to

[Github] Pull Request #16376 (squito)

Activity

People

Assignee:: Imran Rashid

Reporter:: Imran Rashid

Votes:: 0 Vote for this issue

Watchers:: 5 Start watching this issue

Dates

Created:: 21/Dec/16 18:23

Updated:: 17/May/20 17:48

Resolved:: 07/Feb/17 06:38