Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
4.0.0
Description
ShuffleOrigin was introduced to determine whether the shuffle can be adjusted or not, and if it can be, in which way.
ENSURE_REQUIREMENTS used by the rule `EnsureRequirements` denotes that the shuffle can be adjusted as long as the requirements are still ensured. Here the requirements do not include the number of partitions so AQE can possibly optimize the number (assuming that the number of partitions isn't a strong requirement), but for stateful operators, the number of partitions is a strong requirement and it must not be adjusted.
To prevent any further modification of AQE to break stateful operator (as same as the reason we introduced StatefulOpClusteredDistribution), we want to explicitly mark the shuffle backing up stateful operator as different ShuffleOrigin.
Attachments
Issue Links
- links to