Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
1.34.0
Description
When the query contains multiple IN/SOME sub-queries connected with OR predicate in WHERE clause, the result is wrong. The minimal reproducer is below:
SQL:
select empno from sales.empnullables where deptno in ( select deptno from sales.deptnullables where name = 'dept1') or deptno in ( select deptno from sales.deptnullables where name = 'dept2')
The Plan generated by calcite master branch: (Notice the bold part of <>($2, 0) in the downstream LogicalFilter)
LogicalProject(EMPNO=[$0]) LogicalProject(EMPNO=[$0], DEPTNO=[$1]) LogicalFilter(condition=[OR(AND(<>($2, 0), IS NOT NULL($5), IS NOT NULL($1)), AND(***<>($2, 0)***, IS NOT NULL($9), IS NOT NULL($1)))]) LogicalJoin(condition=[=($1, $8)], joinType=[left]) LogicalJoin(condition=[true], joinType=[inner]) LogicalJoin(condition=[=($1, $4)], joinType=[left]) LogicalJoin(condition=[true], joinType=[inner]) LogicalProject(EMPNO=[$0], DEPTNO=[$7]) LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]]) LogicalAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)]) LogicalProject(DEPTNO=[$0]) LogicalFilter(condition=[=($1, 'dept1')]) LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]]) LogicalProject(DEPTNO=[$0], i=[true]) LogicalFilter(condition=[=($1, 'dept1')]) LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]]) LogicalAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)]) LogicalProject(DEPTNO=[$0]) LogicalFilter(condition=[=($1, 'dept2')]) LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]]) LogicalProject(DEPTNO=[$0], i=[true]) LogicalFilter(condition=[=($1, 'dept2')]) LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
The wrong part is that when build the downstream LogicalFilter for the two sub-queries, the filter for the second sub-query is AND(<>($2, 0), IS NOT NULL($9), IS NOT NULL($1)), notice that $2 should be the second sub-query's intermediate table field ct.c(which field index is $6), but now the actual reference is the first sub-query's, this leads to wrong plan, and wrong result.
The root cause is that intermediate table alias is the same as the previous sub-query's, but when lookup intermediate table field, it always returns the previous one which is not belong to the current subquery.
Attachments
Issue Links
- is related to
-
CALCITE-5390 RelDecorrelator throws NullPointerException
- Open
- relates to
-
CALCITE-5680 Wrong plan for multiple IN sub-queries with only literal operands
- Closed
- links to