Details
- Improvement
- Status: Open
- Major
- Resolution: Unresolved
- 3.1.0
- None
- None
Description
We found a corner case with the new locality algorithm where we were not effectively resetting the locality when an application started without executors. See https://github.com/apache/spark/pull/28656/
In that fix we reset the locality when a new executor is added and it changes the valid locality levels. This could reset it when you don't want it to, for instance if the task set has been around for quite a while and there are other executors it could put tasks on. Resetting back to the most local locality level could delay the scheduling of tasks that have already been waiting for more than the locality wait.
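To make the concern concrete, here is a minimal toy model of the reset behavior described above. It is only a sketch and not Spark's actual TaskSetManager code: the class `ToyTaskSet` and its methods `relax` and `executor_added` are hypothetical names invented for illustration, and the model assumes the reset always goes back to the most local valid level whenever the set of valid levels changes.

```python
from dataclasses import dataclass


@dataclass
class ToyTaskSet:
    """Hypothetical stand-in for a task set's locality state (not Spark code)."""

    # Valid locality levels, ordered from most local to least local.
    valid_levels: list
    # Index into valid_levels of the level tasks may currently be scheduled at.
    current_index: int = 0

    @property
    def current_level(self) -> str:
        return self.valid_levels[self.current_index]

    def relax(self) -> None:
        """Move one step to a less local level, as after a locality wait expires."""
        if self.current_index < len(self.valid_levels) - 1:
            self.current_index += 1

    def executor_added(self, new_valid_levels: list) -> None:
        """Assumed behavior of the fix: reset only if the valid levels changed."""
        if new_valid_levels != self.valid_levels:
            self.valid_levels = list(new_valid_levels)
            self.current_index = 0  # back to the most local level


# A task set that has been around a while and already relaxed to ANY.
task_set = ToyTaskSet(["NODE_LOCAL", "ANY"])
task_set.relax()
level_before = task_set.current_level

# A new executor makes PROCESS_LOCAL valid, which triggers the reset.
task_set.executor_added(["PROCESS_LOCAL", "NODE_LOCAL", "ANY"])
level_after = task_set.current_level

print(level_before, "->", level_after)
```

In this model the task set drops from ANY back to PROCESS_LOCAL, so tasks that could have run on the existing executors have to wait through the locality waits again. That is the delay this issue is about.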
This goes back to the fact that the new algorithm tries to use a heuristic for this, but it's not perfect. Ideally I think we would still track per executor, but that is a lot more change.
Investigate to see if we can handle this better.