Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-5429

Correlated subquery over aggregate with grouping throws an error

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • None
    • None

    Description

      Queries like:

      select (select dname from "scott".dept where dept.deptno = emp.deptno) from "scott".emp group by deptno; 

      Throw an error:

      > java.lang.AssertionError: Required columns {1} not subset of left columns {0}
      >     at org.apache.calcite.util.Litmus.lambda$static$0(Litmus.java:31)
      >     at org.apache.calcite.util.Litmus.check(Litmus.java:76)
      >     at org.apache.calcite.rel.core.Correlate.isValid(Correlate.java:145)
      >     at org.apache.calcite.rel.core.Correlate.<init>(Correlate.java:109)
      >     at org.apache.calcite.rel.logical.LogicalCorrelate.<init>(LogicalCorrelate.java:72)
      >     at org.apache.calcite.rel.logical.LogicalCorrelate.create(LogicalCorrelate.java:115)
      >     at org.apache.calcite.rel.core.RelFactories$CorrelateFactoryImpl.createCorrelate(RelFactories.java:440)
      >     at org.apache.calcite.tools.RelBuilder.join(RelBuilder.java:2864)
      >     at org.apache.calcite.rel.rules.SubQueryRemoveRule.rewriteScalarQuery(SubQueryRemoveRule.java:136)

      It happens because when subquery is converted to LogicalCorrelate, left side of LogicalCorrelate has different row type than expected by correlate field access on the right side. RexFieldAccess expects original (table) row type for $cor0 variable, where column DEPTNO has index 1, but after project and aggregate is applied on the left side, only one column in a row type remains and corresponding column has index 0.

      Plan for this query:

      LogicalProject(EXPR$0=[$SCALAR_QUERY({
      LogicalProject(DNAME=[$1])
        LogicalFilter(condition=[=($0, $cor0.DEPTNO)])
          LogicalTableScan(table=[[scott, DEPT]])
      })]), id = 25096
        LogicalAggregate(group=[{0}]), id = 25094
          LogicalProject(DEPTNO=[$7]), id = 25092
            LogicalTableScan(table=[[scott, EMP]]), id = 25076

      If we change query to:

      select (select dname from "scott".dept where dept.deptno = emp.empno) from "scott".emp group by deptno; 

      It silently return the wrong result (DEPTNO is treated as EMPNO and error is no throwed). 

       

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              alex_pl Aleksey Plekhanov
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated: