Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
The lack of a LogicalProject under LogicalWindow hurts performance.
First of all, this issue occurs when HepPlanner is used with the ProjectToWindowRule.PROJECT rule.
A simple query like:
select sum(deptno) over(partition by deptno) as sum1 from emp
produces
LogicalProject($0=[$9])
  LogicalWindow(window#0=[window(partition {7} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($7)])])
    LogicalTableScan(table=[[CATALOG, SALES, EMP]])
However, from a performance standpoint, it would be better to have a project between LogicalWindow and LogicalTableScan, since only one column is used. Interestingly, such a project does appear when the window function contains an expression. For example,
select sum(deptno + 1) over(partition by deptno) as sum1 from emp
produces
LogicalProject($0=[$2])
  LogicalWindow(window#0=[window(partition {0} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1)])])
    LogicalProject(DEPTNO=[$7], $1=[+($7, 1)])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
The LogicalProject below the window can trim unused columns, or even be pushed down into the scan, which is a very important optimization that Calcite can exploit.
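To make the expected trimming concrete, here is a minimal, self-contained Java sketch of the index bookkeeping involved. It is only an illustration, not Calcite code: the class WindowInputTrimmer and its methods usedColumns and remap are hypothetical names invented for this example. It collects the input columns that the window's partition keys and aggregate arguments reference, then computes where each one would land in a narrow project placed under the window.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical illustration of trimming a window's input; not part of Calcite. */
final class WindowInputTrimmer {

    /** Returns the sorted, de-duplicated input columns that the window reads. */
    static List<Integer> usedColumns(List<Integer> partitionKeys, List<Integer> aggArgs) {
        TreeSet<Integer> used = new TreeSet<>(partitionKeys);
        used.addAll(aggArgs);
        return new ArrayList<>(used);
    }

    /** Maps each original input index to its position in the trimmed project. */
    static Map<Integer, Integer> remap(List<Integer> used) {
        Map<Integer, Integer> mapping = new LinkedHashMap<>();
        for (int i = 0; i < used.size(); i++) {
            mapping.put(used.get(i), i);
        }
        return mapping;
    }

    public static void main(String[] args) {
        // First query in the description: partition {7} and SUM($7).
        List<Integer> used = usedColumns(List.of(7), List.of(7));
        Map<Integer, Integer> mapping = remap(used);

        // Columns the project under the window would need to keep.
        System.out.println("project columns: " + used);    // prints "project columns: [7]"
        // The window's references rewritten against the trimmed input.
        System.out.println("remapped: " + mapping);        // prints "remapped: {7=0}"
    }
}
```

For the first query, the window would then read a one-column input (partition {0}, SUM($0)) instead of the full EMP row, which matches the shape of the plan already produced for the sum(deptno + 1) case.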