Details
- Type: Bug
- Status: Resolved
- Priority: Blocker
- Resolution: Fixed
- 0.7.3, 0.8.0
- None
Description
In ResultTask, the serialization methods writeExternal and readExternal do nothing with generation.
ShuffleMapTask's writeExternal and readExternal, by contrast, do handle it, with calls such as "partition = in.readInt()" and "out.writeLong(generation)".
A ResultTask is used after the ShuffleMapTask. If the work fails for some reason right after the ShuffleMapTask finishes, it is recomputed with a generation greater than -1. The ResultTask, still carrying the default generation, cannot fetch the right data, so it asks the DAGScheduler to recompute the ShuffleMapTask again. This repeats until the whole job crashes.
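The fix described above amounts to writing and reading generation alongside the other fields. The following is a minimal Java sketch of that pattern, not Spark's actual code: SketchTask, its two fields, and the roundTrip helper are hypothetical names chosen for illustration, and the default of -1 is taken from the description.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical stand-in for a task class; not Spark's ResultTask.
class SketchTask implements Externalizable {
    int partition;
    long generation = -1L; // assumed default, per the description

    // Externalizable needs a public no-arg constructor for deserialization.
    public SketchTask() {}

    SketchTask(int partition, long generation) {
        this.partition = partition;
        this.generation = generation;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(partition);
        out.writeLong(generation); // the step the report says was missing
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        // Fields must be read in the same order they were written.
        partition = in.readInt();
        generation = in.readLong();
    }

    // Serializes a task to bytes and reads it back, as shipping it to a worker would.
    static SketchTask roundTrip(SketchTask task) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(task);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SketchTask) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

If the two generation lines were left out, the deserialized object would be built by the no-arg constructor and keep the default generation of -1, whatever value the sender had set. That matches the symptom reported here.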
Issue Links
- duplicates SPARK-836 ResultTask's serialization forget to handle generation (Resolved)