Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
2.1.1
None
Description
When running speculative tasks you can end up with a task failure reported on a speculative task (the other task succeeded) because that task got a CommitDenied exception when really it was "killed" by the driver. It is a race between when the driver kills the task and when the executor tries to commit.
I think ideally we should fix up the task state here to be killed, because the failure of this task doesn't matter once the other speculative task has succeeded. Tasks showing up as failures confuse the user and could make other scheduler cases harder to handle.
This is somewhat related to SPARK-13343, where I think we should be correctly accounting for speculative tasks. Only one of the 2 tasks really succeeded and committed, and the other should be marked differently.
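To make the proposed behavior concrete, here is a rough sketch of the state fix-up in plain Python. It is not Spark code: `TaskState`, `CommitDenied`, `resolve_final_state` and the `kill_requested` flag are hypothetical names made up for illustration. The idea is only that a commit denial on a task the driver already asked to kill gets reported as killed rather than failed.

```python
from enum import Enum
from typing import Optional


class TaskState(Enum):
    FINISHED = "FINISHED"
    FAILED = "FAILED"
    KILLED = "KILLED"


class CommitDenied(Exception):
    """Hypothetical stand-in for the commit-denied error seen on the executor."""


def resolve_final_state(error: Optional[Exception], kill_requested: bool) -> TaskState:
    """Pick the state to report for a finished task attempt.

    error: the exception the attempt ended with, or None on success.
    kill_requested: assumed flag, True if the driver already asked to kill
    this attempt (e.g. because the other speculative attempt succeeded).
    """
    if error is None:
        return TaskState.FINISHED
    # The race described above: the kill arrived, but the attempt reached
    # the commit step first and was denied. Report it as killed.
    if isinstance(error, CommitDenied) and kill_requested:
        return TaskState.KILLED
    # Everything else is left as a failure; other error types are outside
    # the scope of this sketch.
    return TaskState.FAILED


# The losing speculative attempt: denied at commit after a kill request.
print(resolve_final_state(CommitDenied("commit denied"), kill_requested=True))
```

A commit denial without a prior kill request still maps to `FAILED` here, so the sketch only changes the outcome for the racing case.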