Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
3.1.2, 3.2.0, 3.3.0
None
Description
- With FetchFailedException and Map Stage Retries
When rerunning the spark-sql shell with the original SQL from https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315:
1. Stage 3 threw a FetchFailedException, which caused both itself and its parent stage (stage 2) to retry.
2. Stage 2 had been skipped earlier, but its attemptId was still 0, so when its retry happened it was removed from `Skipped Stages`.
As a result, the DAG of Job 2 no longer shows that stage 2 was skipped.
Besides, a retried stage usually runs only a subset of the tasks of the original stage. If we mark it as the original one, its metrics can be misleading.
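
The attemptId collision described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyStageStore`, `mark_skipped`, `on_stage_submitted`, and `skipped_stages` are hypothetical names invented for illustration and are not Spark's actual status-store code. It assumes stages are tracked by a (stage id, attempt id) key and that a skipped stage is recorded under attempt 0.

```python
from dataclasses import dataclass, field

SKIPPED, ACTIVE = "SKIPPED", "ACTIVE"


@dataclass
class ToyStageStore:
    # Maps (stage_id, attempt_id) -> status. Hypothetical model, not Spark's real store.
    stages: dict = field(default_factory=dict)

    def mark_skipped(self, stage_id):
        # Assumption: a skipped stage never ran, so it is recorded under attempt 0.
        self.stages[(stage_id, 0)] = SKIPPED

    def on_stage_submitted(self, stage_id, attempt_id):
        # A submission with the same key overwrites whatever entry was there.
        self.stages[(stage_id, attempt_id)] = ACTIVE

    def skipped_stages(self):
        # Stage ids that still have an entry in the SKIPPED state.
        return sorted(sid for (sid, _), status in self.stages.items() if status == SKIPPED)


store = ToyStageStore()
store.mark_skipped(2)                # stage 2 is skipped in Job 2
before = store.skipped_stages()

# Stage 3 hits a fetch failure, so its parent (stage 2) is submitted for the
# first time; because it never ran before, that submission also uses attempt 0.
store.on_stage_submitted(2, 0)
after = store.skipped_stages()

print(before, after)  # prints [2] []
```

In this model the retry reuses the key of the skipped entry, so stage 2 disappears from the skipped set and the retry looks like the original attempt, which matches the symptom reported here.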