Description
I think our EliminateSorts rule can be extended further to remove sorts beneath repartition, repartitionByExpression and coalesce nodes. Whether or not we do a shuffle, each repartition operation will change the ordering and distribution of the data.
That's why we should be able to rewrite Repartition -> Sort -> Scan as Repartition -> Scan.
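To make the proposed rewrite concrete, here is a minimal sketch on a toy plan tree. It is not Catalyst code: the node classes and the helpers strip_sorts and eliminate_sorts are hypothetical names invented for this illustration, and the real rule would have to be written against Spark's LogicalPlan nodes. The sketch assumes a sort may be dropped only when nothing between it and the repartition depends on ordering, so it looks through Filter but stops at Limit.

```python
from dataclasses import dataclass, replace


# Toy logical-plan nodes (hypothetical, not Spark classes).
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Sort:
    key: str
    child: object


@dataclass(frozen=True)
class Filter:
    condition: str
    child: object


@dataclass(frozen=True)
class Limit:
    n: int
    child: object


@dataclass(frozen=True)
class Repartition:
    num_partitions: int
    child: object


def strip_sorts(plan):
    """Drop Sort nodes that are reachable through Filter nodes only."""
    if isinstance(plan, Sort):
        return strip_sorts(plan.child)
    if isinstance(plan, Filter):
        return replace(plan, child=strip_sorts(plan.child))
    # Any other node (e.g. Limit) may depend on its input order: stop here.
    return plan


def eliminate_sorts(plan):
    """Rewrite the plan bottom-up, removing sorts directly feeding a Repartition."""
    if isinstance(plan, Scan):
        return plan
    child = eliminate_sorts(plan.child)
    if isinstance(plan, Repartition):
        child = strip_sorts(child)
    return replace(plan, child=child)


before = Repartition(10, Sort("a", Scan("t")))
after = eliminate_sorts(before)
print(after)
```

In this toy model, Repartition -> Sort -> Scan becomes Repartition -> Scan, while Repartition -> Limit -> Sort -> Scan is left alone because the sort decides which rows the limit keeps.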
Issue Links
- is related to SPARK-32318 Add a test case to EliminateSortsSuite for protecting ORDER BY in DISTRIBUTE BY (Resolved)