Calcite / CALCITE-4213

Druid plans with small intervals should be chosen over full interval scan plus filter

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: druid-adapter
    • Labels: None

    Description

      The problem surfaced when DruidAdapterIT#testFilterTimestamp failed on the following query:

      select count(*) as c
      from "foodmart"
      where extract(year from "timestamp") = 1997
        and extract(month from "timestamp") in (4, 6)
      

      Expected

      EnumerableInterpreter
       DruidQuery(table=[[foodmart, foodmart]], intervals=[[1997-04-01T00:00:00.000Z/1997-05-01T00:00:00.000Z, 1997-06-01T00:00:00.000Z/1997-07-01T00:00:00.000Z]], projects=[[0]], groups=[{}], aggs=[[COUNT()]])
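
      To make the expected intervals concrete, here is a minimal sketch of how the two predicates (year = 1997, month in (4, 6)) map to half-open month intervals in UTC. The class MonthIntervals and its method forMonths are hypothetical names chosen for illustration; this is not how the Druid adapter derives intervals internally, only a way to check the values by hand.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Calcite. */
final class MonthIntervals {
  // Same textual shape as the intervals printed in the plan (UTC assumed).
  private static final DateTimeFormatter FMT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

  /** Returns one "start/end" interval per requested month of the given year. */
  static List<String> forMonths(int year, int... months) {
    List<String> intervals = new ArrayList<>();
    for (int month : months) {
      YearMonth ym = YearMonth.of(year, month);
      LocalDate start = ym.atDay(1);              // first day of the month
      LocalDate end = ym.plusMonths(1).atDay(1);  // first day of the next month (exclusive)
      intervals.add(FMT.format(start.atStartOfDay())
          + "/" + FMT.format(end.atStartOfDay()));
    }
    return intervals;
  }

  public static void main(String[] args) {
    // Mirrors: extract(year ...) = 1997 and extract(month ...) in (4, 6)
    System.out.println(forMonths(1997, 4, 6));
  }
}
```

      For 1997 and months 4 and 6 this yields the two intervals listed in the expected plan.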
      

      Actual

      EnumerableInterpreter
        DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z]], filter=[AND(=(EXTRACT(FLAG(YEAR), $0), 1997), OR(=(EXTRACT(FLAG(MONTH), $0), 4), =(EXTRACT(FLAG(MONTH), $0), 6)))], groups=[{}], aggs=[[COUNT()]])
      

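      Note that the actual plan scans a single interval (1900-01-09 to 2992-01-10) that covers essentially all data and evaluates the EXTRACT conditions as a filter, whereas the expected plan restricts the scan to two one-month intervals. In most cases the actual plan is therefore less efficient than the expected one.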
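
      As a rough illustration of the gap between the two plans, the sketch below sums the number of days covered by each plan's intervals. The class IntervalSpan and its method totalDays are hypothetical, and the time span covered is only a crude proxy for scan cost under the assumption that data is spread over time; it is not Calcite's cost model.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for illustration only; not part of Calcite. */
final class IntervalSpan {
  /** Sums the whole days covered by ISO-8601 "start/end" intervals. */
  static long totalDays(String... intervals) {
    long days = 0;
    for (String interval : intervals) {
      String[] parts = interval.split("/");
      Instant start = Instant.parse(parts[0]);
      Instant end = Instant.parse(parts[1]);
      days += Duration.between(start, end).toDays();
    }
    return days;
  }

  public static void main(String[] args) {
    // Intervals copied from the expected plan.
    long expected = totalDays(
        "1997-04-01T00:00:00.000Z/1997-05-01T00:00:00.000Z",
        "1997-06-01T00:00:00.000Z/1997-07-01T00:00:00.000Z");
    // Interval copied from the actual plan.
    long actual = totalDays(
        "1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z");
    System.out.println("expected plan covers " + expected + " days");
    System.out.println("actual plan covers " + actual + " days");
  }
}
```

      The expected plan covers 60 days (April plus June 1997), while the actual plan's interval spans more than a thousand years before the filter is applied.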

            People

              Assignee: Unassigned
              Reporter: Stamatis Zampetakis (zabetak)
              Votes: 0
              Watchers: 2