Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-3508

Strengthen outer Join to inner if it is under a Filter that discards null values

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None

      Description

      Today, FilterJoinRule given an Outer Join the rule strengthens it to Inner Join when the nullable side contains a filter IS_NOT_NULL. Below is the code.

      for (RexNode filter : aboveFilters) {
            if (joinType.generatesNullsOnLeft()
                && Strong.isNotTrue(filter, leftBitmap)) {
              joinType = joinType.cancelNullsOnLeft();
            }
            if (joinType.generatesNullsOnRight()
                && Strong.isNotTrue(filter, rightBitmap)) {
              joinType = joinType.cancelNullsOnRight();
            }
            if (!joinType.isOuterJoin()) {
              break;
            }
          }
      

      This code looks at the filter to determine if it is always true, then it can alter the join type by removing the null on that side.

      We can see this in the following test RelOptRules#testStrengthenJoinType, which executes the following SQL that transforms from a LEFT OUTER JOIN to an INNER JOIN

      select *
      from dept left join emp on dept.deptno = emp.deptno
      where emp.deptno is not null and emp.sal > 100
      

      This ticket is about broadening the application of this rule to a sql like the following:

      select *
      from dept left join emp on dept.deptno = emp.deptno
      where emp.sal > 100
      

       This originally came up on the mailing list: https://mail-archives.apache.org/mod_mbox/calcite-dev/201909.mbox/%3CCAGzrZ38hh%2B8B8TG8gVHUfVKunJZ4L%2BK7rmrinQxpwOHcdJbzGQ%40mail.gmail.com%3E

      and in that thread it was pointed out that there are filters that prevent this from being applied:

      SELECT b.title
      FROM Book b
      LEFT JOIN Author a ON b.author = a.id
      WHERE a.name <> 'Victor'
      

      This means we need to ensure we that the OUTER JOIN doesn't contain – for lack of a different term – negation filters. If there is a negation – like NOT_EQUAL – the JOIN cannot be strengthened.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                ScottReynolds Scott Reynolds
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 2h 50m
                  2h 50m