Details
-
Improvement
-
Status: In Progress
-
Major
-
Resolution: Unresolved
-
3.4.0
-
None
-
None
Description
When performing a global sort, we first sample the data to compute the range bounds, then use a range partitioner for the shuffle exchange.
The issue is that the sample plan is coupled with the shuffle plan, which prevents us from optimizing the sample plan. The sample plan needs only the sort-order columns, but the shuffle plan contains all data columns. So, at a minimum, we can apply column pruning to the sample plan so that it fetches only the ordering columns.
A common example is: `OPTIMIZE table ZORDER BY columns`
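
To illustrate the idea, here is a minimal plain-Python sketch, not Spark's actual implementation. It assumes rows are dictionaries; all function names (`sample_sort_keys`, `range_bounds`, `range_partition`, `global_sort`) are hypothetical. The sampling step reads only the sort column, while the shuffle step still moves full rows:

```python
import bisect
import random


def sample_sort_keys(rows, sort_col, sample_size, seed=0):
    """Sample plan (hypothetical): project the sort column first, then sample.

    Only the ordering column is read here; the other columns are never touched.
    """
    keys = [row[sort_col] for row in rows]
    rng = random.Random(seed)
    return rng.sample(keys, min(sample_size, len(keys)))


def range_bounds(sampled_keys, num_partitions):
    """Pick up to num_partitions - 1 sorted, distinct upper bounds from the sample."""
    ordered = sorted(sampled_keys)
    if not ordered or num_partitions <= 1:
        return []
    step = len(ordered) / num_partitions
    picked = [ordered[min(int(step * i), len(ordered) - 1)]
              for i in range(1, num_partitions)]
    return sorted(set(picked))


def range_partition(rows, sort_col, bounds):
    """Shuffle plan (hypothetical): route each full row to a partition by its key."""
    partitions = [[] for _ in range(len(bounds) + 1)]
    for row in rows:
        partitions[bisect.bisect_left(bounds, row[sort_col])].append(row)
    return partitions


def global_sort(rows, sort_col, num_partitions, sample_size=100):
    """Sample keys, range-partition full rows, then sort each partition locally."""
    bounds = range_bounds(sample_sort_keys(rows, sort_col, sample_size),
                          num_partitions)
    result = []
    for part in range_partition(rows, sort_col, bounds):
        result.extend(sorted(part, key=lambda r: r[sort_col]))
    return result
```

Because the bounds depend only on the sampled keys, the sampling step can be planned and optimized separately from the step that moves full rows, which is the decoupling proposed here.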