Details
-
Sub-task
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
3.0.0
Description
Currently OptimizeSkewedJoin will not apply if skew mitigation causes a new shuffle.
There are situations where it's better to mitigate skew even if it means a new shuffle is added, for example if the join outputs small amount of data.
As a first step I propose adding a SQLConf option to enable this.
I'll open a PR shortly to get feedback on the approach.
Attachments
Issue Links
- causes
-
SPARK-37328 SPARK-33832 brings the bug that OptimizeSkewedJoin may not work since it was applied on whole plan innstead of new stage plan
- Resolved
- links to
(4 links to)