Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.0.0
None
None
Description
I regularly see the executors web UI showing more active tasks than the executor has cores. Looking at the code, it seems that we track those separately, and the message sent for task end is asynchronous and thus shows up in the UI much later than the start event.
On statusUpdate, CoarseGrainedSchedulerBackend increases freeCores, which then allows the scheduler to assign another task, but the taskEndEvent is asynchronous.
We definitely don't want to slow down the scheduling path, so I'm not sure how easy this will be to improve.
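To make the suspected ordering concrete, here is a minimal toy model of it. This is a sketch, not Spark code: ActiveTaskLagSketch, Model, launchTask, finishTask and deliverPendingEnds are hypothetical names, and the model simply assumes that a task start reaches the UI counter right away while a task end is only applied when pending events are drained.

```scala
// Toy model only; none of these names exist in Spark.
object ActiveTaskLagSketch {

  final class Model(cores: Int) {
    // Scheduler-side bookkeeping: updated synchronously.
    var freeCores: Int = cores
    // UI-side counter: decremented only when end events are delivered.
    var uiActiveTasks: Int = 0
    // End events that have been produced but not yet seen by the UI.
    private var pendingEnds: Int = 0

    // A task is launched on a free core; the UI sees the start immediately.
    def launchTask(): Unit = {
      require(freeCores > 0, "no free core to launch a task on")
      freeCores -= 1
      uiActiveTasks += 1
    }

    // A task finishes: the core is released at once, but the end event
    // is only queued (assumption: it is delivered asynchronously).
    def finishTask(): Unit = {
      freeCores += 1
      pendingEnds += 1
    }

    // The asynchronous end events finally reach the UI.
    def deliverPendingEnds(): Unit = {
      uiActiveTasks -= pendingEnds
      pendingEnds = 0
    }
  }

  // One core: start task 1, finish it, start task 2 before the end event lands.
  // Returns the UI count before and after the end event is delivered.
  def oneCoreDemo(): (Int, Int) = {
    val m = new Model(cores = 1)
    m.launchTask()
    m.finishTask()
    m.launchTask()
    val before = m.uiActiveTasks
    m.deliverPendingEnds()
    (before, m.uiActiveTasks)
  }
}
```

In this model the UI count briefly exceeds the core count even though the scheduler never oversubscribes the core, which matches what the executors page shows. Making the end event synchronous would remove the gap in the model, at the cost of putting UI bookkeeping on the scheduling path.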
To reproduce I just ran:
val df = sc.makeRDD(1 to 10000000, 6).toDF
val df2 = sc.makeRDD(1 to 10000000, 6).toDF
spark.time(df.select( $"value" as "a").join(df2.select($"value" as "b"), $"a" === $"b").write.mode("overwrite").csv("somefile"))
Then view the executors UI page. I started spark-shell with just 1 core per executor, and the page shows 2 active tasks.