Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
4.0.0
-
None
Description
【sql】
// code placeholder SELECT /*+ SHUFFLE_MERGE(b) */ s_date, sum(s_quantity * i_price) AS total_sales FROM sales a JOIN items b ON s_item_id = i_item_id WHERE i_price < 10 GROUP BY s_date with rollup;
Set spark.sql.shuffle.partitions=1000
After aqe:
The parallel reads in the ExpandExecut phase have been adjusted to 71, reducing parallelism. The ExpandExecut phase can lead to data expansion, and a decrease in parallelism can result in longer task execution times.
If AGE is turned off as a whole, AQE optimization cannot be enjoyed in other stages. If it is found that ExpandExec is included in the current stage, partition merging will not be performed for this issue.