Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: None
    • Labels:
      None

      Description

      The lack of Project under LogicalWindow hurts the performance.

      Firstly of all, this issue happens when HepPlanner is used with ProjectToWindowRule.PROJECT rule.

      A simple query like:

      select sum(deptno) over(partition by deptno) as sum1 
      from emp
      

      produces

      LogicalProject($0=[$9])
        LogicalWindow(window#0=[window(partition {7} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($7)])])
          LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      

      However, from performance standpoint, it is better to have a project between LogicalWindow and LogicalTableScan since only one column is used. Interestingly, when there is an expression in the window function. For example,

      select sum(deptno + 1) over(partition by deptno) as sum1 
      from emp"
      

      produces

      LogicalProject($0=[$2])
        LogicalWindow(window#0=[window(partition {0} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1)])])
          LogicalProject(DEPTNO=[$7], $1=[+($7, 1)])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      

      The LogicalProject below window can trim out useless columns or even be pushed into Scan, which is very important optimization Calcite can exploit.

        Activity

        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment -

        Resolved in release 1.5.0 (2015-11-10)

        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - Resolved in release 1.5.0 (2015-11-10)
        Show
        julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/5219eb8c . Thanks for the patch, Sean Hsuan-Yi Chu !
        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment -

        You mean implementing this new method in RelFieldTrimmer?

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - You mean implementing this new method in RelFieldTrimmer?
        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment -

        I saw you do some modification and committed to your branch. Thanks!

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - I saw you do some modification and committed to your branch. Thanks!
        Hide
        julianhyde Julian Hyde added a comment -

        The patch looks good.

        Did you consider adding

        RelFieldTrimmer.trimFields(
              Window window,
              ImmutableBitSet fieldsUsed,
              Set<RelDataTypeField> extraFields)

        I think it might have accomplished the same goal.

        Show
        julianhyde Julian Hyde added a comment - The patch looks good. Did you consider adding RelFieldTrimmer.trimFields( Window window, ImmutableBitSet fieldsUsed, Set<RelDataTypeField> extraFields) I think it might have accomplished the same goal.
        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment -

        Julian Hyde Before you start reviewing process, could you first take a look at the comment here.
        Essentially, we could have two ways to add that LogicalProject below LogicalWindow:
        (1). In CalcRelSplitter:
        As Calcite tries to split expression, it "might" be possible to add a level at the bottom to create the project. However, there are two issues with this approach. Firstly, the splitter has the objective of using the least levels to split expressions. Even though adding this LogicalProject helps the performance overall, this operation seems contradicting the objective of this rule (Sounds doing too many things in a single rule). Secondly, it is very much difficult to go with this approach without major modification in the code. (At least, I still cannot find a proper place in the logic to do this thing.)

        (2). Add a new rule (as implemented in the pull request):
        Based on the points above, I think it is better to go with this route.

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - Julian Hyde Before you start reviewing process, could you first take a look at the comment here. Essentially, we could have two ways to add that LogicalProject below LogicalWindow: (1). In CalcRelSplitter: As Calcite tries to split expression, it "might" be possible to add a level at the bottom to create the project. However, there are two issues with this approach. Firstly, the splitter has the objective of using the least levels to split expressions. Even though adding this LogicalProject helps the performance overall, this operation seems contradicting the objective of this rule (Sounds doing too many things in a single rule). Secondly, it is very much difficult to go with this approach without major modification in the code. (At least, I still cannot find a proper place in the logic to do this thing.) (2). Add a new rule (as implemented in the pull request): Based on the points above, I think it is better to go with this route.
        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment -

        Julian Hyde Can you help review? Thanks!

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - Julian Hyde Can you help review? Thanks!
        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - Pull request: https://github.com/apache/incubator-calcite/pull/123

          People

          • Assignee:
            julianhyde Julian Hyde
            Reporter:
            seanhychu Sean Hsuan-Yi Chu
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development