  1. Calcite
  2. CALCITE-4213

Druid plans with small intervals should be chosen over full interval scan plus filter


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: druid-adapter
    • Labels:
      None

      Description

      The problem surfaced through the failure of DruidAdapterIT#testFilterTimestamp, which plans the following query:

      select count(*) as c
      from "foodmart"
      where extract(year from "timestamp") = 1997
        and extract(month from "timestamp") in (4, 6)
      

      Expected

      EnumerableInterpreter
        DruidQuery(table=[[foodmart, foodmart]], intervals=[[1997-04-01T00:00:00.000Z/1997-05-01T00:00:00.000Z, 1997-06-01T00:00:00.000Z/1997-07-01T00:00:00.000Z]], projects=[[0]], groups=[{}], aggs=[[COUNT()]])
      

      Actual

      EnumerableInterpreter
        DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z]], filter=[AND(=(EXTRACT(FLAG(YEAR), $0), 1997), OR(=(EXTRACT(FLAG(MONTH), $0), 4), =(EXTRACT(FLAG(MONTH), $0), 6)))], groups=[{}], aggs=[[COUNT()]])
      

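      Note that the actual plan scans a single interval (1900-01-09 to 2992-01-10) that covers essentially all data and evaluates the EXTRACT conditions as a filter on every row. The expected plan instead restricts the scan to the two one-month intervals, so in most cases the actual plan is less efficient than the expected one.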
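
      To make the expected intervals concrete, here is a minimal sketch of how the year and month predicates from the query map to Druid-style interval strings. It is only an illustration using java.time, not the adapter's actual code; the class name MonthIntervals and the method intervalsFor are hypothetical.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Calcite: illustrates the interval
// arithmetic behind the expected plan.
public class MonthIntervals {
  // Matches the timestamp layout shown in the plans, e.g. 1997-04-01T00:00:00.000Z.
  private static final DateTimeFormatter FMT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

  // Returns one half-open interval [first day of month, first day of next month)
  // per requested month, formatted as "start/end" in UTC.
  public static List<String> intervalsFor(int year, int... months) {
    List<String> result = new ArrayList<>();
    for (int month : months) {
      LocalDate start = LocalDate.of(year, month, 1);
      LocalDate end = start.plusMonths(1); // rolls over to January for December
      result.add(format(start) + "/" + format(end));
    }
    return result;
  }

  private static String format(LocalDate date) {
    return date.atStartOfDay(ZoneOffset.UTC).format(FMT);
  }

  public static void main(String[] args) {
    // extract(year) = 1997 and extract(month) in (4, 6)
    System.out.println(intervalsFor(1997, 4, 6));
  }
}
```

      For the query above, this yields the two intervals listed in the expected plan, which is the narrow scan the planner should prefer over the full-range interval plus filter.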

              People

              • Assignee:
                Unassigned
                Reporter:
                zabetak Stamatis Zampetakis