Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-770

LogicalAggregate plan node looks incorrect when window functions are present

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.0-incubating
    • Fix Version/s: 1.4.0-incubating
    • Component/s: None
    • Labels:
      None

      Description

      For the following query:

        select min(sal), max(sal), 
            rank() over (order by job), 
            sum(sal) over (order by job) 
          from emp 
        group by job, sal
      

      I would have expected the LogicalAggregate to do the group-by and produce the MIN and MAX only. The plan below shows 2 additional fields: agg#2=RANK and agg#3=SUM($1) . I think this is incorrect because these functions should be associated with the LogicalWindow node only (which does not show up in the plan yet). Converting the LogicalAggregate to a physical plan node creates difficulties due to this. Is the physical aggregate supposed to only look at EXPR$1 and EXPR$2 and ignore agg#1, agg#2 ?

      LogicalProject(EXPR$0=[$2], EXPR$1=[$3], EXPR$2=[RANK() OVER (ORDER BY $0 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)], EXPR$3=[SUM($1) OVER (ORDER BY $0 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)])
        LogicalAggregate(group=[{0, 1}], EXPR$0=[MIN($1)], EXPR$1=[MAX($1)], agg#2=[RANK()], agg#3=[SUM($1)])
          LogicalProject(JOB=[$2], SAL=[$5])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      

        Activity

        Hide
        julianhyde Julian Hyde added a comment -

        Yes, the plan needs to be

        LogicalProject(EXPR$0=[$2], EXPR$1=[$3], EXPR$2=[RANK() OVER (ORDER BY $0 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)], EXPR$3=[SUM($1) OVER (ORDER BY $0 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)])
          LogicalAggregate(group=[{0, 1}], EXPR$0=[MIN($1)], EXPR$1=[MAX($1)])
            LogicalProject(JOB=[$2], SAL=[$5])
              LogicalTableScan(table=[[CATALOG, SALES, EMP]])
        

        The bug is in SqlToRelConverter. When creating an Aggregate and looking for aggregate functions, it needs to ignore windowed aggregate functions.

        Add tests to SqlToRelConverterTest, and add some queries to winagg.oq. Add a test where this query is the LHS of a join; if there are spurious columns, the offsets to the RHS of the join will be wrong. Also add a test of a winagg on an agg on a join, and a query with winagg, agg, and HAVING clause.

        Show
        julianhyde Julian Hyde added a comment - Yes, the plan needs to be LogicalProject(EXPR$0=[$2], EXPR$1=[$3], EXPR$2=[RANK() OVER (ORDER BY $0 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)], EXPR$3=[SUM($1) OVER (ORDER BY $0 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)]) LogicalAggregate(group=[{0, 1}], EXPR$0=[MIN($1)], EXPR$1=[MAX($1)]) LogicalProject(JOB=[$2], SAL=[$5]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) The bug is in SqlToRelConverter. When creating an Aggregate and looking for aggregate functions, it needs to ignore windowed aggregate functions. Add tests to SqlToRelConverterTest, and add some queries to winagg.oq. Add a test where this query is the LHS of a join; if there are spurious columns, the offsets to the RHS of the join will be wrong. Also add a test of a winagg on an agg on a join, and a query with winagg, agg, and HAVING clause.
        Hide
        amansinha100 Aman Sinha added a comment -

        Thanks for the input. I am working on a fix. The fact that window aggregates and regular aggregates both return true for isAggregator() causes problems, so I need to look for SqlOverOperator.

        Show
        amansinha100 Aman Sinha added a comment - Thanks for the input. I am working on a fix. The fact that window aggregates and regular aggregates both return true for isAggregator() causes problems, so I need to look for SqlOverOperator.
        Hide
        amansinha100 Aman Sinha added a comment -

        Patch is available at: https://github.com/amansinha100/incubator-calcite/tree/window_1
        I have added tests as suggested by Julian Hyde.
        For some reason running the winagg.oq test shows a formatting diff. I just cut-pasted the output from ./core/target/surefire/sql/winagg.oq. May need someone's help on that.

        Show
        amansinha100 Aman Sinha added a comment - Patch is available at: https://github.com/amansinha100/incubator-calcite/tree/window_1 I have added tests as suggested by Julian Hyde . For some reason running the winagg.oq test shows a formatting diff. I just cut-pasted the output from ./core/target/surefire/sql/winagg.oq. May need someone's help on that.
        Hide
        amansinha100 Aman Sinha added a comment -
        Show
        amansinha100 Aman Sinha added a comment - Pull request: https://github.com/apache/incubator-calcite/pull/99
        Show
        julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/e1378123 .
        Hide
        jnadeau Jacques Nadeau added a comment -

        Resolved in release 1.4.0-incubating (2015-08-23)

        Show
        jnadeau Jacques Nadeau added a comment - Resolved in release 1.4.0-incubating (2015-08-23)

          People

          • Assignee:
            amansinha100 Aman Sinha
            Reporter:
            amansinha100 Aman Sinha
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development