Calcite / CALCITE-4202

Refine Druid cost-model to capture differences in intermediate projections

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: druid-adapter
    • Labels: None

    Description

      The planner generates equivalent DruidQuery expressions that have exactly the same cost. In most cases the expressions differ only in the number of intermediate projections.

      For example, running the following query

      select distinct "countryName"
      from "wiki"
      where "page" = 'Jeremy Corbyn'
      

      via DruidAdapterIT#testSelectDistinctWiki generates, among others, the following alternatives during optimization.

      Choice 1

      rel#184:DruidQuery.BINDABLE.[](table=[wiki, wiki],intervals=[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z],filter==($13, 'Jeremy Corbyn'),projects=[$5, $13],groups={0},aggs=[])
      

      Choice 2

      rel#108:DruidQuery.BINDABLE.[](table=[wiki, wiki],intervals=[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z],filter==($13, 'Jeremy Corbyn'),projects=[$5],groups={0},aggs=[])
      

      Stepping through with a debugger shows that the two plans have exactly the same cost even though they differ, so whichever one is generated first dominates the other. In this case choice 2 is clearly the better plan: it projects only $5, whereas choice 1 also projects $13, a column that the grouping does not use.
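
      The effect of a cost tie can be illustrated with a small standalone sketch. It does not use Calcite's classes: Candidate and pickBest are hypothetical names, the cost value 10.0 is made up, and the selection rule (replace the current best only when a later candidate is strictly cheaper) is an assumption about how a cost-based planner typically keeps its best plan.

```java
import java.util.Arrays;
import java.util.List;

public class TieBreakSketch {
    /** Hypothetical stand-in for a candidate plan: a label and a scalar cost. */
    static final class Candidate {
        final String label;
        final double cost;

        Candidate(String label, double cost) {
            this.label = label;
            this.cost = cost;
        }
    }

    /** Keeps the earliest candidate unless a later one is strictly cheaper. */
    static Candidate pickBest(List<Candidate> inGenerationOrder) {
        Candidate best = null;
        for (Candidate c : inGenerationOrder) {
            if (best == null || c.cost < best.cost) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Both alternatives get the same (made-up) cost, as in the issue.
        Candidate choice1 = new Candidate("projects=[$5, $13]", 10.0);
        Candidate choice2 = new Candidate("projects=[$5]", 10.0);

        // The winner depends only on which alternative was generated first.
        System.out.println(pickBest(Arrays.asList(choice1, choice2)).label);
        System.out.println(pickBest(Arrays.asList(choice2, choice1)).label);
    }
}
```

      With equal costs, swapping the generation order swaps the winner, which is the plan instability described here.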

      Performance-wise the difference may not be large, but refining the cost is beneficial at least for plan stability. Currently the final plan depends on the order in which the rules are applied.

      The goal of this issue is to refine Druid's cost model so that choice 2 above becomes cheaper than choice 1.
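
      One possible shape for such a refinement is sketched below, as an illustration only and not as the adapter's actual cost function. The idea is to multiply a base cost by a factor that grows with the number of projected fields, so that two otherwise identical plans no longer tie. The names linear and refinedCost, the base cost 10.0, and the bounds (1 to 100 fields mapped to a factor of 1.0 to 2.0) are all assumptions chosen for the example.

```java
public class ProjectionCostSketch {
    /** Linearly maps x from [minX, maxX] onto [minY, maxY], clamping outside the range. */
    static double linear(int x, int minX, int maxX, double minY, double maxY) {
        if (x <= minX) {
            return minY;
        }
        if (x >= maxX) {
            return maxY;
        }
        return minY + (double) (x - minX) / (maxX - minX) * (maxY - minY);
    }

    /**
     * Hypothetical refined cost: the base cost scaled by a factor between
     * 1.0 (one projected field) and 2.0 (100 or more projected fields).
     */
    static double refinedCost(double baseCost, int projectedFields) {
        return baseCost * linear(projectedFields, 1, 100, 1.0, 2.0);
    }

    public static void main(String[] args) {
        double baseCost = 10.0; // made-up value shared by both alternatives

        double choice1 = refinedCost(baseCost, 2); // projects=[$5, $13]
        double choice2 = refinedCost(baseCost, 1); // projects=[$5]

        // Choice 2 is now strictly cheaper, independent of generation order.
        System.out.println("choice 2 cheaper: " + (choice2 < choice1));
    }
}
```

      A multiplicative factor keeps the relative ordering of unrelated plans largely intact while still breaking ties between plans that differ only in projection width; an additive per-field term would be another option.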

          People

            Assignee: Unassigned
            Reporter: Stamatis Zampetakis (zabetak)
            Votes: 0
            Watchers: 2