- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: 1.5.0
- Component/s: None
- Labels: None
The lack of Project under LogicalWindow hurts the performance.
First of all, this issue occurs when HepPlanner is used with the ProjectToWindowRule.PROJECT rule.
A simple query like:
select sum(deptno) over(partition by deptno) as sum1 from emp
produces
LogicalProject($0=[$9])
  LogicalWindow(window#0=[window(partition {7} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($7)])])
    LogicalTableScan(table=[[CATALOG, SALES, EMP]])
However, from a performance standpoint, it would be better to have a Project between LogicalWindow and LogicalTableScan, since only one column (DEPTNO, $7) is used. Interestingly, the planner does produce such a Project when the window function contains an expression. For example,
select sum(deptno + 1) over(partition by deptno) as sum1 from emp
produces
LogicalProject($0=[$2])
  LogicalWindow(window#0=[window(partition {0} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1)])])
    LogicalProject(DEPTNO=[$7], $1=[+($7, 1)])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
The LogicalProject below the window can trim unused columns, or even be pushed into the scan, which is a very important optimization that Calcite can exploit.
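To illustrate the trimming being asked for, here is a minimal, self-contained sketch in plain Java. It does not use Calcite's API and is not how the fix is implemented; the class WindowInputTrim and its method names are hypothetical. It only shows the index arithmetic: collect the input fields the window references, keep just those in a Project, and renumber the partition keys and aggregate arguments against that narrower input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical helper (not part of Calcite): computes the narrow Project a window needs. */
public class WindowInputTrim {

    /** Fields to keep, plus window references renumbered against the kept fields. */
    public static final class Result {
        public final List<Integer> projectedFields; // original input ordinals kept, ascending
        public final List<Integer> partitionKeys;   // partition keys relative to the new Project
        public final List<Integer> aggArgs;         // aggregate arguments relative to the new Project
        public final int windowOutputOrdinal;       // ordinal of the first window output field

        Result(List<Integer> projectedFields, List<Integer> partitionKeys,
               List<Integer> aggArgs) {
            this.projectedFields = projectedFields;
            this.partitionKeys = partitionKeys;
            this.aggArgs = aggArgs;
            // Window outputs are appended after all input fields.
            this.windowOutputOrdinal = projectedFields.size();
        }
    }

    /** Keeps only the input fields referenced by the partition keys or aggregate arguments. */
    public static Result trim(List<Integer> partitionKeys, List<Integer> aggArgs) {
        TreeSet<Integer> used = new TreeSet<>();
        used.addAll(partitionKeys);
        used.addAll(aggArgs);
        List<Integer> projected = new ArrayList<>(used);
        return new Result(projected, remap(partitionKeys, projected), remap(aggArgs, projected));
    }

    /** Rewrites each original ordinal to its position in the projected field list. */
    private static List<Integer> remap(List<Integer> refs, List<Integer> projected) {
        List<Integer> out = new ArrayList<>();
        for (int ref : refs) {
            out.add(projected.indexOf(Integer.valueOf(ref)));
        }
        return out;
    }

    public static void main(String[] args) {
        // First query in this report: partition {7}, aggs [SUM($7)] over the EMP scan.
        Result r = trim(List.of(7), List.of(7));
        System.out.println("project " + r.projectedFields
            + ", partition " + r.partitionKeys
            + ", agg args " + r.aggArgs
            + ", window output at $" + r.windowOutputOrdinal);
    }
}
```

For the first query, this keeps only input field 7 (DEPTNO), so the window would partition by {0} and compute SUM($0), and the top LogicalProject would read $1 instead of $9.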