Details
Type: Sub-task
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 3.2.1
None
None
Description
As discussed in the thread in SPARK-37023, if the parent stage has already generated merged output in a previous attempt, then during a stage retry the child stage is currently not able to fetch that merged output. This is because fetching merged output is controlled by dependency.shuffleMergeEnabled (see current implementation here), and that flag also governs the stage retry.
Instead of using a single variable to control behavior on both the mapper side (pushing blocks) and the reducer side (using merged output), whether a child stage uses merged output should depend only on whether merged output is available for it to use (as discussed here).
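The difference between the two behaviors can be illustrated with a small toy model. This is only a sketch, not Spark code: the class ShuffleDependency below and the functions use_merged_output_current and use_merged_output_proposed are hypothetical stand-ins, and it assumes that the merge flag is off for the retried attempt while merged output from the earlier attempt is still registered.

```python
from dataclasses import dataclass, field


@dataclass
class ShuffleDependency:
    """Toy stand-in for a shuffle dependency (hypothetical, not Spark's class)."""
    # Push side: may mappers of the current attempt push blocks for merging?
    shuffle_merge_enabled: bool = True
    # Reducer side: merged outputs registered by an earlier attempt,
    # keyed by reduce partition id.
    merged_outputs: dict = field(default_factory=dict)


def use_merged_output_current(dep: ShuffleDependency, partition: int) -> bool:
    """Behavior described in the issue: one flag gates both sides."""
    return dep.shuffle_merge_enabled and partition in dep.merged_outputs


def use_merged_output_proposed(dep: ShuffleDependency, partition: int) -> bool:
    """Proposed behavior: the reducer only checks whether merged output exists."""
    return partition in dep.merged_outputs


# A previous attempt of the parent stage produced merged output
# for reduce partitions 0 and 1.
dep = ShuffleDependency(merged_outputs={0: "merged-0", 1: "merged-1"})

# Assumption for this sketch: pushing is switched off for the retried attempt.
dep.shuffle_merge_enabled = False

print(use_merged_output_current(dep, 0))   # False: merged output is ignored
print(use_merged_output_proposed(dep, 0))  # True: merged output is used
print(use_merged_output_proposed(dep, 2))  # False: nothing was merged for partition 2
```

In this model the push-side flag can change between attempts without affecting the reducer-side decision, which is the decoupling the description asks for.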