Description
StagePage hangs after the following operations.
1. Run a job with shuffle.
scala> sc.parallelize(1 to 10).map(x => (x, x)).reduceByKey(_ + _).collect
2. Visit StagePage for the stage writing shuffle data and check `Shuffle Write Time`.
3. Run a job with no shuffle.
scala> sc.parallelize(1 to 10).collect
4. Visit StagePage for the last stage.
The cause is as follows.
In stagepage.js, the array `optionalColumns` holds the indices of the columns for optional metrics.
If a stage performs no shuffle read, or no shuffle write, the corresponding indices are removed from the array.
As a result, StagePage doesn't try to create columns for those metrics, even if the saved state of the corresponding optional metrics is "visible".
However, if a stage performs neither shuffle read nor shuffle write, the index for `Shuffle Write Time` isn't removed.
In that case, StagePage tries to create a column for `Shuffle Write Time` even though there are no shuffle write metrics, which causes the hang.
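The expected behavior can be sketched as below. This is a hypothetical model, not the code in stagepage.js: the function name `renderableOptionalColumns`, the constants, and all index values are assumptions made for illustration. It only shows that the check for shuffle write must be independent of the check for shuffle read, so that a stage with neither has the `Shuffle Write Time` index removed as well.

```javascript
// Hypothetical model of the column selection described above.
// Names and index values are illustrative, not copied from stagepage.js.
const SHUFFLE_READ_COLUMNS = [3, 4];  // assumed indices for shuffle read metrics
const SHUFFLE_WRITE_TIME_COLUMN = 5;  // assumed index for `Shuffle Write Time`

// Returns the optional column indices that are safe to render for a stage.
function renderableOptionalColumns(optionalColumns, hasShuffleRead, hasShuffleWrite) {
  let columns = optionalColumns.slice();
  if (!hasShuffleRead) {
    columns = columns.filter((i) => !SHUFFLE_READ_COLUMNS.includes(i));
  }
  // Checked independently of shuffle read, so the "neither" case is covered too.
  if (!hasShuffleWrite) {
    columns = columns.filter((i) => i !== SHUFFLE_WRITE_TIME_COLUMN);
  }
  return columns;
}

const optionalColumns = [1, 2, 3, 4, 5];
// Stage with no shuffle at all (step 3 above): no shuffle columns remain.
console.log(renderableOptionalColumns(optionalColumns, false, false));
// Stage that only writes shuffle data (step 1 above): `Shuffle Write Time` is kept.
console.log(renderableOptionalColumns(optionalColumns, false, true));
```

With this shape, a "visible" state preserved from an earlier stage cannot cause a column to be created for a metric the current stage does not have.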
Issue Links
- relates to SPARK-31073 Add "shuffle write time" to task metrics summary in StagePage. (Resolved)