Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-5669

Add rules to remove correlate when one of its inputs is empty

Rank to TopRank to BottomBulk Copy AttachmentsBulk Move AttachmentsVotersWatch issueWatchersConvert to sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 1.35.0
    • core

    Description

      Consider the following query correlated query.

      select * from emp as e where exists (select 1 from dept as d where e.empno = null)
      

      The query basically returns an empty result because e.empno = null is always false.

      The plan for the query after applying the sub-query remove rule is shown below:

      LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
        LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
          LogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{0}])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
            LogicalAggregate(group=[{0}])
              LogicalProject(i=[true])
                LogicalFilter(condition=[=($cor0.EMPNO, null)])
                  LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
      

      After applying the reduce expressions rule the filter with the correlated condition will become false and the resulting plan would be the following.

      LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
        LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
          LogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{0}])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
            LogicalAggregate(group=[{0}])
              LogicalProject(i=[true])
                LogicalValues(tuples=[[]])
      

      Observe that now we have a LogicalCorrelate but there is no real correlation in the plan since the correlation variable on the right side disappeared. Depending on how rules are applied and which rules are used similar "trivial" correlates may appear.

      The goal of this ticket is to provide the means to get rid of them.

      One option would be to add a new rule (e.g., CorrelateToJoinRule) which detects that a correlate does not have correlations in the right side and turn the correlation to a join; then we could employ other existing rules (such as PruneEmptyRules) for joins and remove the newly created join altogether.

      Another option, would be to introduce new pruning rule(s) for correlate (similar to those for joins) that will remove the correlate when its input is an empty values expression.

      Attachments

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            zabetak Stamatis Zampetakis
            zabetak Stamatis Zampetakis
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment