Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
1.35.0
-
None
Description
The following test will fail if added to RelOptRulesTest.java
@Test void testCanNotPushSemiJoinToRightJoinBranch() { final Function<RelBuilder, RelNode> relFn = b -> b .scan("EMP") .scan("DEPT") .join(JoinRelType.LEFT, b.equals( b.field(2, 0, "DEPTNO"), b.field(2, 1, "DEPTNO")) ) .scan("BONUS") // semi join only relates to RHS fields of left join .semiJoin(b.equals( b.field(2, 0, "DNAME"), b.field(2, 1, "JOB"))) .build(); relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged(); }
Produced plan will look like:
LogicalJoin(condition=[=($7, $8)], joinType=[left]) LogicalTableScan(table=[[scott, EMP]]) LogicalJoin(condition=[=($1, $4)], joinType=[semi]) LogicalTableScan(table=[[scott, DEPT]]) LogicalTableScan(table=[[scott, BONUS]])
Which is different from the original plan:
LogicalJoin(condition=[=($9, $12)], joinType=[semi]) LogicalJoin(condition=[=($7, $8)], joinType=[left]) LogicalTableScan(table=[[scott, EMP]]) LogicalTableScan(table=[[scott, DEPT]]) LogicalTableScan(table=[[scott, BONUS]])
This is not correct - in general case it is not correct to push semi-join to right side of left-join.
The reason is the following:
Consider rows from EMP that have no matching rows in DEPT. These rows will have nulls for DEPT columns in the result of left-join and they will be rejected by the top semi-join.
But if we push semi-join to RHS of left-join, we are going to see rows from EMP with nulls on the DEPT side in the final result.