Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-1842

Wrong order of inputs for makeCost() call in Sort.computeSelfCost()

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.14.0
    • Component/s: core
    • Labels:
      None
    • Flags:
      Important

      Description

      Original code in Sort.java

      @Override public RelOptCost computeSelfCost(RelOptPlanner planner,
            RelMetadataQuery mq) {
          // Higher cost if rows are wider discourages pushing a project through a
          // sort.
          double rowCount = mq.getRowCount(this);
          double bytesPerRow = getRowType().getFieldCount() * 4;
          return planner.getCostFactory().makeCost(
              Util.nLogN(rowCount) * bytesPerRow, rowCount, 0);
      

      The last line should be

      return planner.getCostFactory().makeCost(
              rowCount/*rowCount*/, Util.nLogN(rowCount) * bytesPerRow/*cpu*/, 0/*io*/);
      

      The wrong order will make the planner choose the wrong physical plan. For example, if the druid query has a limit of 10 with 10+ dimensions, the optimizer will choose not push the "limit" down to druid instead choose scanning entire data source in druid.

      The fix is very easy, the gain is huge as the performance of the wrong plan is really bad. Hope it will be picked up by the next release.

        Activity

        Hide
        axeisghost Junxian Wu added a comment -

        We also encounter this issue when we used Calcite to connect druid. We may want this fix sooner so a pull request has been opened in public github repo.

        https://github.com/apache/calcite/pull/491

        Right now test cases are not added yet because I did not find examples that can generally check cost before optimization with rules provided by different adapters. Please let me know if you have any idea about the test cases that can check this fix. Thank you.

        Show
        axeisghost Junxian Wu added a comment - We also encounter this issue when we used Calcite to connect druid. We may want this fix sooner so a pull request has been opened in public github repo. https://github.com/apache/calcite/pull/491 Right now test cases are not added yet because I did not find examples that can generally check cost before optimization with rules provided by different adapters. Please let me know if you have any idea about the test cases that can check this fix. Thank you.
        Hide
        julianhyde Julian Hyde added a comment -

        Reviewing now.

        Show
        julianhyde Julian Hyde added a comment - Reviewing now.
        Hide
        julianhyde Julian Hyde added a comment -

        Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/48a20668. No test case needed -
        one of the existing tests generated an improved plan. Thanks for the PR, Junxian Wu!

        Show
        julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/48a20668 . No test case needed - one of the existing tests generated an improved plan. Thanks for the PR, Junxian Wu !
        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment -

        Junxian Wu, Julian Hyde, this fix causes three DruidAdapterIT tests to fail:

        Results :
        
        Failed tests:
          DruidAdapterIT.testGroupByFloorTimeWithLimit:1884
        Expected: a string containing "PLAN=EnumerableLimit(fetch=[3])\n  EnumerableInterpreter\n    DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[FLOOR($0, FLAG(MONTH))]], groups=[{0}], aggs=[[]], sort0=[0], dir0=[DESC])"
             but: was "PLAN=EnumerableInterpreter
          BindableSort(sort0=[$0], dir0=[DESC], fetch=[3])
            DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[FLOOR($0, FLAG(MONTH))]], groups=[{0}], aggs=[[]], sort0=[0], dir0=[DESC])
        
        "
          DruidAdapterIT.testSortLimit:530
        Expected: a string containing "PLAN=EnumerableLimit(offset=[2], fetch=[3])\n  EnumerableInterpreter\n    DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[$39, $30]], groups=[{0, 1}], aggs=[[]], sort0=[1], sort1=[0], dir0=[ASC], dir1=[DESC])"
             but: was "PLAN=EnumerableInterpreter
          BindableSort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[DESC], offset=[2], fetch=[3])
            DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[$39, $30]], groups=[{0, 1}], aggs=[[]], sort0=[1], sort1=[0], dir0=[ASC], dir1=[DESC])
        
        "
          DruidAdapterIT.testTopNMonthGranularity:1200
        Expected: a string containing "PLAN=EnumerableInterpreter\n  BindableProject(S=[$2], M=[$3], P=[$0])\n    DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[$30, FLOOR($0, FLAG(MONTH)), $89]], groups=[{0, 1}], aggs=[[SUM($2), MAX($2)]], sort0=[2], dir0=[DESC], fetch=[3])"
             but: was "PLAN=EnumerableCalc(expr#0..3=[{inputs}], S=[$t2], M=[$t3], P=[$t0])
          EnumerableInterpreter
            DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[$30, FLOOR($0, FLAG(MONTH)), $89]], groups=[{0, 1}], aggs=[[SUM($2), MAX($2)]], sort0=[2], dir0=[DESC], fetch=[3])
        
        "
        
        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - Junxian Wu , Julian Hyde , this fix causes three DruidAdapterIT tests to fail: Results : Failed tests: DruidAdapterIT.testGroupByFloorTimeWithLimit:1884 Expected: a string containing "PLAN=EnumerableLimit(fetch=[3])\n EnumerableInterpreter\n DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[FLOOR($0, FLAG(MONTH))]], groups=[{0}], aggs=[[]], sort0=[0], dir0=[DESC])" but: was "PLAN=EnumerableInterpreter BindableSort(sort0=[$0], dir0=[DESC], fetch=[3]) DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[FLOOR($0, FLAG(MONTH))]], groups=[{0}], aggs=[[]], sort0=[0], dir0=[DESC]) " DruidAdapterIT.testSortLimit:530 Expected: a string containing "PLAN=EnumerableLimit(offset=[2], fetch=[3])\n EnumerableInterpreter\n DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[$39, $30]], groups=[{0, 1}], aggs=[[]], sort0=[1], sort1=[0], dir0=[ASC], dir1=[DESC])" but: was "PLAN=EnumerableInterpreter BindableSort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[DESC], offset=[2], fetch=[3]) DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[$39, $30]], groups=[{0, 1}], aggs=[[]], sort0=[1], sort1=[0], dir0=[ASC], dir1=[DESC]) " DruidAdapterIT.testTopNMonthGranularity:1200 Expected: a string containing "PLAN=EnumerableInterpreter\n BindableProject(S=[$2], M=[$3], P=[$0])\n DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[$30, FLOOR($0, FLAG(MONTH)), $89]], groups=[{0, 1}], aggs=[[SUM($2), MAX($2)]], sort0=[2], dir0=[DESC], fetch=[3])" but: was "PLAN=EnumerableCalc(expr#0..3=[{inputs}], S=[$t2], M=[$t3], P=[$t0]) EnumerableInterpreter DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[$30, FLOOR($0, FLAG(MONTH)), $89]], groups=[{0, 1}], aggs=[[SUM($2), MAX($2)]], sort0=[2], dir0=[DESC], fetch=[3]) "
        Hide
        julianhyde Julian Hyde added a comment -

        My bad. As committer of this change I should have checked Druid. I'll fix it today.

        Show
        julianhyde Julian Hyde added a comment - My bad. As committer of this change I should have checked Druid. I'll fix it today.
        Hide
        julianhyde Julian Hyde added a comment -
        Show
        julianhyde Julian Hyde added a comment - I've fixed the plan differences in http://git-wip-us.apache.org/repos/asf/calcite/commit/05595f64 .
        Hide
        axeisghost Junxian Wu added a comment -

        Sorry, I did not look at JIRA recently. It seems the plan changed a little bit after the cost of sort computed differently. Thanks Julian for fixing test cases.

        Show
        axeisghost Junxian Wu added a comment - Sorry, I did not look at JIRA recently. It seems the plan changed a little bit after the cost of sort computed differently. Thanks Julian for fixing test cases.
        Hide
        julianhyde Julian Hyde added a comment -

        I'm not totally happy with the new plans. There was one case where DruidQuery did a sort and it was followed (I think) a BindableSort doing both a sort and a fetch/offset, when a fetch/offset would be sufficient. And other cases where BindableSort or BindableProject is used and EnumerableSort would be better. I spent a few hours yesterday trying to improve them, but couldn't, without breaking other stuff.

        So, Jesus Camacho Rodriguez and slim bouguerra, please keep an eye on the plans in DruidAdapterIT and make sure they're not getting too bad.

        Show
        julianhyde Julian Hyde added a comment - I'm not totally happy with the new plans. There was one case where DruidQuery did a sort and it was followed (I think) a BindableSort doing both a sort and a fetch/offset, when a fetch/offset would be sufficient. And other cases where BindableSort or BindableProject is used and EnumerableSort would be better. I spent a few hours yesterday trying to improve them, but couldn't, without breaking other stuff. So, Jesus Camacho Rodriguez and slim bouguerra , please keep an eye on the plans in DruidAdapterIT and make sure they're not getting too bad.
        Hide
        michaelmior Michael Mior added a comment -

        Resolved in release 1.14.0 (2017-10-01)

        Show
        michaelmior Michael Mior added a comment - Resolved in release 1.14.0 (2017-10-01)

          People

          • Assignee:
            julianhyde Julian Hyde
            Reporter:
            jdhwlp JD Zheng
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              Original Estimate - 1h
              1h
              Remaining:
              Remaining Estimate - 1h
              1h
              Logged:
              Time Spent - Not Specified
              Not Specified

                Development