Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- 1.26.0
- None
Description
RelBuilder#join produces wrong or invalid relational expressions when correlated variables are passed as a parameter together with certain join types and non-trivial conditions (that is, conditions that are not always true).
Wrong plans already exist in the code base, where the requiredColumns attribute of LogicalCorrelate is missing some columns.
Consider for instance the middle plan in RelOptRulesTest#testSelectNotInCorrelated:
LogicalProject(SAL=[$5], EXPR$1=[IS NULL($10)])
  LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{2}]) <-- PROBLEM
    LogicalTableScan(table=[[CATALOG, SALES, EMP]])
    LogicalFilter(condition=[=($cor0.EMPNO, $0)]) <-- $cor0.EMPNO refers to column 0 in EMP relation
      LogicalProject(DEPTNO=[$0], i=[true])
        LogicalFilter(condition=[=($cor0.JOB, $1)]) <-- $cor0.JOB refers to column 2 in EMP relation
          LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
The EMPNO column (index 0), which is referenced through the correlation variable in the right input, is not present in the requiredColumns attribute.
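As a rough illustration of what the attribute ought to contain, the sketch below collects the ordinal of every left-input column that is read through the correlation variable. It is plain Java rather than Calcite code: the class and method names are hypothetical, and the EMP column list is an assumption about the CATALOG.SALES.EMP test schema (the plan above only confirms EMPNO at 0, JOB at 2 and SAL at 5).

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Calcite. */
final class RequiredColumnsSketch {
  /**
   * Returns the ordinals of the left-input columns that the right input
   * references through the correlation variable.
   */
  static BitSet requiredColumns(List<String> leftFields, List<String> correlatedRefs) {
    BitSet required = new BitSet();
    for (String name : correlatedRefs) {
      int ordinal = leftFields.indexOf(name);
      if (ordinal < 0) {
        throw new IllegalArgumentException("Unknown field: " + name);
      }
      required.set(ordinal);
    }
    return required;
  }

  public static void main(String[] args) {
    // Assumed column order of the EMP table in the test catalog.
    List<String> emp = Arrays.asList(
        "EMPNO", "ENAME", "JOB", "MGR", "HIREDATE", "SAL", "COMM", "DEPTNO", "SLACKER");
    // $cor0.EMPNO and $cor0.JOB both appear in the right input of the Correlate.
    System.out.println(requiredColumns(emp, Arrays.asList("EMPNO", "JOB"))); // prints {0, 2}
  }
}
```

Under these assumptions the plan above should carry requiredColumns=[{0, 2}] rather than [{2}].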
Invalid plans are created when the join type is SEMI or ANTI and the join condition uses columns from the right side. Currently, the join condition is added as a Filter on top of the Correlate, where the columns from the right side no longer exist, so the filter references inputs that are not valid.
If we are lucky we will get an AssertionError when constructing the Filter operator:
RexInputRef index 8 out of range 0..7
java.lang.AssertionError: RexInputRef index 8 out of range 0..7
  at org.apache.calcite.util.Litmus$1.fail(Litmus.java:32)
  at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:125)
  at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:61)
  at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:114)
  at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:144)
  at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:61)
  at org.apache.calcite.rex.RexCall.accept(RexCall.java:189)
  at org.apache.calcite.rel.core.Filter.isValid(Filter.java:127)
  at org.apache.calcite.rel.logical.LogicalFilter.<init>(LogicalFilter.java:72)
  at org.apache.calcite.rel.logical.LogicalFilter.create(LogicalFilter.java:116)
  at org.apache.calcite.rel.core.RelFactories$FilterFactoryImpl.createFilter(RelFactories.java:345)
  at org.apache.calcite.tools.RelBuilder.filter(RelBuilder.java:1349)
  at org.apache.calcite.tools.RelBuilder.filter(RelBuilder.java:1307)
  at org.apache.calcite.tools.RelBuilder.join(RelBuilder.java:2407)
otherwise (assertions disabled) we will end up with an invalid plan.
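// v is a Holder<RexCorrelVariable> and type is a JoinRelType (SEMI or ANTI), both declared elsewhere.
RelNode root = builder
    .scan("EMP")
    .variable(v)
    .scan("DEPT")
    .join(type,
        builder.equals(
            builder.field(2, 0, "DEPTNO"),
            builder.field(2, 1, "DEPTNO")),
        ImmutableSet.of(v.get().id))
    .build();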
Actual plan
LogicalFilter(condition=[=($7, $8)]) <- PROBLEM I
  LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{}]) <- PROBLEM II
    LogicalTableScan(table=[[scott, EMP]])
    LogicalTableScan(table=[[scott, DEPT]])
Expected plan
LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{7}])
  LogicalTableScan(table=[[scott, EMP]])
  LogicalFilter(condition=[=($cor0.DEPTNO, $0)])
    LogicalTableScan(table=[[scott, DEPT]])
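The difference between the two plans can be sketched as a rewrite of the join condition before it is placed on the right input. The snippet below is a plain-Java illustration, not the actual fix in RelBuilder: the class and method names are hypothetical, and the eight-column scott.EMP layout is an assumption consistent with the "0..7" range in the assertion message. References to left columns become field accesses on $cor0 and are recorded as required columns; references to right columns are shifted down by the left field count.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch for illustration only; not Calcite code. */
final class SemiJoinConditionSketch {
  /** Ordinals of left columns that the rewritten condition reads through $cor0. */
  final BitSet requiredColumns = new BitSet();
  private final List<String> leftFields;

  SemiJoinConditionSketch(List<String> leftFields) {
    this.leftFields = leftFields;
  }

  /** Rewrites one input reference over (left ++ right) for a filter on the right input. */
  String rewriteRef(int index) {
    if (index < leftFields.size()) {
      // Left column: read it through the correlation variable and remember it.
      requiredColumns.set(index);
      return "$cor0." + leftFields.get(index);
    }
    // Right column: shift the ordinal so it is relative to the right input.
    return "$" + (index - leftFields.size());
  }

  /** Rewrites an equality between two input references. */
  String rewriteEquals(int first, int second) {
    return "=(" + rewriteRef(first) + ", " + rewriteRef(second) + ")";
  }

  public static void main(String[] args) {
    // Assumed column order of scott.EMP (8 columns, ordinals 0..7).
    SemiJoinConditionSketch sketch = new SemiJoinConditionSketch(Arrays.asList(
        "EMPNO", "ENAME", "JOB", "MGR", "HIREDATE", "SAL", "COMM", "DEPTNO"));
    System.out.println(sketch.rewriteEquals(7, 8)); // prints =($cor0.DEPTNO, $0)
    System.out.println(sketch.requiredColumns);     // prints {7}
  }
}
```

Applied to the condition =($7, $8) from the actual plan, this yields the filter condition and the requiredColumns value shown in the expected plan.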
Issue Links
- is related to: HIVE-24999 HiveSubQueryRemoveRule generates invalid plan for IN subquery with multiple correlations (Closed)
- links to