Details
Type: Improvement
Status: Closed
Priority: Minor
Resolution: Won't Fix
Description
When a query applies nested GROUP BYs, the first (inner) GROUP BY contains no aggregates or expressions, and the fields of the second (outer) GROUP BY are a subset of the fields of the first, then the first GROUP BY can be removed.
For example, the following SQL:
SELECT a FROM (SELECT a,b,c FROM MyTable GROUP BY a, b, c) GROUP BY a
should be optimized into
DataStreamGroupAggregate(groupBy=[a], select=[a])
  DataStreamCalc(select=[a])
    DataStreamScan(table=[[_DataStreamTable_0]])
but the plan actually produced is:
DataStreamGroupAggregate(groupBy=[a], select=[a])
  DataStreamCalc(select=[a])
    DataStreamGroupAggregate(groupBy=[a, b, c], select=[a, b, c])
      DataStreamScan(table=[[_DataStreamTable_0]])
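To illustrate why dropping the inner GROUP BY should be safe for this query, here is a small check that runs the nested query and a flattened one against the same rows. It uses Python's built-in sqlite3 as a stand-in for the real engine, and the sample rows are made up, so it only demonstrates that the two queries return the same result, not how the planner behaves.

```python
import sqlite3

def run(conn, sql):
    """Run a single-column query and return its values in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))

# In-memory database standing in for the streaming table (assumption: the
# sample rows below are invented for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 1, 1), (1, 2, 3), (2, 1, 1), (2, 5, 5), (3, 0, 0)],
)

# Query from the description: GROUP BY a on top of GROUP BY a, b, c.
nested = run(
    conn,
    "SELECT a FROM (SELECT a, b, c FROM MyTable GROUP BY a, b, c) GROUP BY a",
)

# Same query with the inner GROUP BY removed.
flat = run(conn, "SELECT a FROM MyTable GROUP BY a")

print(nested, flat)
```

Both queries return the distinct values of a, because grouping by (a, b, c) first and by a afterwards cannot add or remove any value of a.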
I looked through Calcite's built-in rules but could not find a matching one. So maybe we should implement such a rule, possibly in Calcite itself.
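As a rough sketch of what such a rule would have to check, the rewrite can be modelled on a toy plan tree. The classes Scan, Calc and Aggregate and the function remove_redundant_aggregate below are hypothetical and are not Calcite or Flink APIs. The sketch also assumes that the outer aggregate has no aggregate calls either, which holds for the example above but is stricter than the condition stated in the description.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Hypothetical toy plan nodes; these are NOT Calcite or Flink classes.
@dataclass(frozen=True)
class Scan:
    table: str

@dataclass(frozen=True)
class Calc:
    fields: Tuple[str, ...]      # plain field references only, no expressions
    input: "Node"

@dataclass(frozen=True)
class Aggregate:
    group_by: Tuple[str, ...]
    agg_calls: Tuple[str, ...]   # e.g. ("COUNT(*)",); empty means pure grouping
    input: "Node"

Node = Union[Scan, Calc, Aggregate]

def remove_redundant_aggregate(node: Node) -> Node:
    """Drop an inner pure-grouping Aggregate below an outer pure-grouping one."""
    # Assumption: only handle an outer aggregate without aggregate calls.
    if not isinstance(node, Aggregate) or node.agg_calls:
        return node
    # Allow an optional projection between the two aggregates.
    calc = node.input if isinstance(node.input, Calc) else None
    inner = calc.input if calc is not None else node.input
    # The inner aggregate must contain no aggregate calls.
    if not isinstance(inner, Aggregate) or inner.agg_calls:
        return node
    # The outer group fields must be a subset of the inner group fields.
    if not set(node.group_by) <= set(inner.group_by):
        return node
    new_input = inner.input
    if calc is not None:
        new_input = Calc(calc.fields, new_input)
    return Aggregate(node.group_by, (), new_input)

scan = Scan("_DataStreamTable_0")
plan = Aggregate(("a",), (), Calc(("a",), Aggregate(("a", "b", "c"), (), scan)))
optimized = remove_redundant_aggregate(plan)
print(optimized)
```

Applied to the plan from the description, the sketch yields the aggregate on a over a Calc over the scan, which matches the expected plan. A real rule would also have to verify that the intermediate Calc only forwards fields and does not compute expressions.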