Details
- Sub-task
- Status: Closed
- Major
- Resolution: Fixed
- 1.9.1
Description
Many TPC-DS queries use the ROLLUP keyword, which is translated into multiple grouping sets.
For example, group by rollup (channel, id) is equivalent to group by (channel, id) + group by (channel) + group by ().
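The expansion above can be sketched in a few lines. This is only an illustration of the ROLLUP semantics, not planner code; the function name rollup_grouping_sets is hypothetical.

```python
def rollup_grouping_sets(columns):
    """Return the grouping sets that ROLLUP(columns) expands to.

    ROLLUP drops columns from the right one at a time and ends with
    the empty grouping set (), which produces the grand total.
    """
    cols = tuple(columns)
    return [cols[:n] for n in range(len(cols), -1, -1)]


grouping_sets = rollup_grouping_sets(["channel", "id"])
# The last entry is the empty group, the one that ends up on a single node.
empty_group = grouping_sets[-1]
```

The empty tuple is always the last grouping set, so every ROLLUP query contains an empty group.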
All data for the empty group is shuffled to a single node, which is a typical data skew case. If there is a local aggregate, the amount of data shuffled to that single node is greatly reduced. However, the cost model currently can't estimate the local aggregate's cost, so a plan without a local aggregate may be chosen even when the query has the rollup keyword.
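A toy simulation of the effect, assuming a SUM aggregate and in-memory lists standing in for upstream partitions. All names here are hypothetical and the numbers are illustrative, not measurements.

```python
def records_sent_for_empty_group(partitions, local_aggregate):
    """Count records the upstream partitions send to the single global task."""
    if not local_aggregate:
        # Every input row is forwarded unchanged.
        return sum(len(rows) for rows in partitions)
    # Each non-empty partition pre-aggregates and sends one partial result.
    return sum(1 for rows in partitions if rows)


def global_sum(partitions, local_aggregate):
    """Compute SUM over the empty group, with or without a local phase."""
    if local_aggregate:
        partials = [sum(rows) for rows in partitions if rows]
        return sum(partials)
    return sum(value for rows in partitions for value in rows)


# Four upstream partitions with 1000 rows each (made-up sizes).
partitions = [list(range(1000)) for _ in range(4)]
without_local = records_sent_for_empty_group(partitions, local_aggregate=False)
with_local = records_sent_for_empty_group(partitions, local_aggregate=True)
```

In this sketch the single node receives one record per partition instead of one per input row, while the final result is unchanged.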
We could add a rule-based phase (after the physical phase) to enforce a local aggregate if its input has an empty group.
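One way such a rule might look, as a minimal sketch over a toy plan model. The Expand and Aggregate classes and the enforce_local_aggregate function are hypothetical and do not correspond to the planner's real node types; the exchange between the local and global phases is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Expand:
    """Toy node that replicates rows once per grouping set."""
    grouping_sets: Tuple[Tuple[str, ...], ...]


@dataclass(frozen=True)
class Aggregate:
    """Toy aggregate node; phase is 'single', 'local' or 'global'."""
    phase: str
    input: object


def has_empty_group(node):
    # The empty tuple stands for the empty grouping set ().
    return isinstance(node, Expand) and () in node.grouping_sets


def enforce_local_aggregate(agg):
    """Split a one-phase aggregate into local + global if its input has ()."""
    if agg.phase == "single" and has_empty_group(agg.input):
        local = Aggregate("local", agg.input)
        return Aggregate("global", local)
    return agg  # leave every other plan as the cost-based phase chose it


plan = Aggregate("single", Expand((("channel", "id"), ("channel",), ())))
rewritten = enforce_local_aggregate(plan)
```

Because the rewrite is unconditional once the pattern matches, it does not depend on the cost model's estimate for the local aggregate.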