Calcite / CALCITE-4212

Revisit cost-model to break ties between Enumerable and Bindable expressions


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Most Enumerable and Bindable expressions use exactly the same cost function to compute their cost. Depending on the query, this can produce several equivalent (sub)plans with exactly the same cost, so the plan that is finally chosen depends on the order in which the rules are applied.

      Consider, for example, the following query from DruidAdapterIT#testProject:

      select "product_name", 0 as zero
      from "foodmart"
      order by "product_name";
      

      At some point during planning the optimizer needs to decide between the following plans:

      Choice 1

      EnumerableSort(sort0=[$0], dir0=[ASC]), id = 37
        EnumerableInterpreter(subset=[rel#23:RelSubset#1.ENUMERABLE.[]]), id = 43
          DruidQuery(subset=[rel#26:RelSubset#1.BINDABLE.[]], table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z]], projects=[[$3, 0]]), id = 25
      

      Choice 2

      EnumerableInterpreter, id = 61
        BindableSort(subset=[rel#40:RelSubset#1.BINDABLE.[0]], sort0=[$0], dir0=[ASC]), id = 41
          DruidQuery(subset=[rel#26:RelSubset#1.BINDABLE.[]], table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z]], projects=[[$3, 0]]), id = 25
      

      Both choices have exactly the same cost since BindableSort and EnumerableSort use the same cost function (Sort#computeSelfCost, RelMdRowCount#getRowCount(Sort, RelMetadataQuery)).
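      The tie and its order dependence can be reproduced with a minimal, self-contained sketch that does not use Calcite at all. Everything in it is an assumption made for illustration: the class and method names are hypothetical, the cost formula is only loosely shaped like an n*log(n) sort cost, and the selection loop is a stand-in for the planner rather than a description of how the real planner compares costs. The cost of the EnumerableInterpreter node is ignored, since it appears in both choices.

```java
import java.util.List;

public class CostTieDemo {
    /** Hypothetical candidate plan: a label plus the inputs of its sort cost. */
    public static final class Plan {
        public final String label;
        public final double rowCount;
        public final double bytesPerRow;

        public Plan(String label, double rowCount, double bytesPerRow) {
            this.label = label;
            this.rowCount = rowCount;
            this.bytesPerRow = bytesPerRow;
        }
    }

    /** One shared cost function for both conventions (illustrative n*log(n) shape). */
    public static double sortCost(Plan p) {
        double n = Math.max(p.rowCount, 2.0);
        return n * Math.log(n) * p.bytesPerRow;
    }

    /** Keeps the first cheapest plan; the strict '<' means a tie goes to the earlier candidate. */
    public static Plan pickCheapest(List<Plan> candidates) {
        Plan best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (Plan p : candidates) {
            double cost = sortCost(p);
            if (cost < bestCost) {
                best = p;
                bestCost = cost;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Plan choice1 = new Plan("EnumerableSort over EnumerableInterpreter", 1000, 8);
        Plan choice2 = new Plan("EnumerableInterpreter over BindableSort", 1000, 8);
        // Same candidates, different arrival order: the winner follows the order.
        System.out.println(pickCheapest(List.of(choice1, choice2)).label);
        System.out.println(pickCheapest(List.of(choice2, choice1)).label);
    }
}
```

      Because both candidates evaluate to the same number, the winner is whichever one the loop sees first, which mirrors the dependence on rule application order described above.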

      The issue can appear with various other expressions such as Project, SetOp, etc.

      Although the example is taken from the Druid adapter, the same can happen in any other use case where both the Bindable and Enumerable conventions are used during planning.
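      One conceivable way to break such ties is to scale the shared cost by a per-convention factor. The sketch below is only an illustration of that idea under stated assumptions: the enum, the multiplier values, and the choice of which convention to penalize are all hypothetical and are not proposed in this issue or taken from Calcite's code.

```java
import java.util.List;

public class TieBreakSketch {
    /** Hypothetical conventions with an assumed cost multiplier each. */
    public enum Convention {
        ENUMERABLE(1.0),
        BINDABLE(2.0); // assumption: penalize interpreted operators; the values are arbitrary

        public final double costMultiplier;

        Convention(double costMultiplier) {
            this.costMultiplier = costMultiplier;
        }
    }

    /** Hypothetical candidate: a label, the convention of its sort, and the shared base cost. */
    public static final class Candidate {
        public final String label;
        public final Convention convention;
        public final double baseCost;

        public Candidate(String label, Convention convention, double baseCost) {
            this.label = label;
            this.convention = convention;
            this.baseCost = baseCost;
        }
    }

    /** Shared base cost scaled by the convention's multiplier, so equal base costs no longer tie. */
    public static double cost(Candidate c) {
        return c.baseCost * c.convention.costMultiplier;
    }

    /** Same first-cheapest selection as before; with distinct costs the order no longer matters. */
    public static Candidate pickCheapest(List<Candidate> candidates) {
        Candidate best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (Candidate c : candidates) {
            double current = cost(c);
            if (current < bestCost) {
                best = c;
                bestCost = current;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Candidate enumerableSort = new Candidate("EnumerableSort", Convention.ENUMERABLE, 100.0);
        Candidate bindableSort = new Candidate("BindableSort", Convention.BINDABLE, 100.0);
        // Both orders now select the same candidate.
        System.out.println(pickCheapest(List.of(enumerableSort, bindableSort)).label);
        System.out.println(pickCheapest(List.of(bindableSort, enumerableSort)).label);
    }
}
```

      With distinct multipliers the result is deterministic regardless of candidate order. Whether a multiplier is the right mechanism, and which convention should be preferred, remains an open question for this issue.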


              People

              • Assignee: Unassigned
              • Reporter: Stamatis Zampetakis (zabetak)
              • Votes: 0
              • Watchers: 2
