Calcite / CALCITE-1706

DruidAggregateFilterTransposeRule causes very fine-grained aggregations to be pushed to Druid

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 1.12.0

    Description

      Enabling DruidAggregateFilterTransposeRule may cause very fine-grained aggregations to be pushed to Druid.

      Running DruidAdapterIT.testFilterTimestamp, here is the previous plan (with DruidAggregateFilterTransposeRule disabled):

      EnumerableInterpreter
        BindableAggregate(group=[{}], C=[COUNT()])
          BindableFilter(condition=[AND(>=(/INT(Reinterpret($0), 86400000), 1997-01-01), <(/INT(Reinterpret($0), 86400000), 1998-01-01), OR(AND(>=(/INT(Reinterpret($0), 86400000), 1997-04-01), <(/INT(Reinterpret($0), 86400000), 1997-05-01)), AND(>=(/INT(Reinterpret($0), 86400000), 1997-06-01), <(/INT(Reinterpret($0), 86400000), 1997-07-01))))])
            DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[$0]])
      

      Here is the (in my opinion inferior) plan with DruidAggregateFilterTransposeRule enabled:

      EnumerableInterpreter
        BindableAggregate(group=[{}], C=[$SUM0($1)])
          BindableFilter(condition=[AND(=(EXTRACT_DATE(FLAG(YEAR), /INT(Reinterpret($0), 86400000)), 1997), OR(=(EXTRACT_DATE(FLAG(MONTH), /INT(Reinterpret($0), 86400000)), 4), =(EXTRACT_DATE(FLAG(MONTH), /INT(Reinterpret($0), 86400000)), 6)))])
            DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], groups=[{0}], aggs=[[COUNT()]])
      

      Note that the DruidQuery is now grouping by __timestamp (groups=[{0}]). Given that __timestamp has very high cardinality, is this an efficient operation for Druid?
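
      To make the concern concrete, here is a minimal sketch in plain Java (not Calcite or Druid code) of what the transposed plan computes. The class and method names are hypothetical, and small integers stand in for __timestamp values. Both strategies return the same count, but the transposed one first materializes one partial COUNT per distinct key, which is why a high-cardinality key is worrying.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TransposeSketch {
  /** Plan without the rule: filter individual rows, then COUNT them. */
  public static long filterThenCount(List<Integer> keys, int lo, int hi) {
    return keys.stream().filter(k -> k >= lo && k < hi).count();
  }

  /** Inner aggregate of the transposed plan: one partial COUNT per distinct key. */
  public static Map<Integer, Long> countPerKey(List<Integer> keys) {
    Map<Integer, Long> partial = new TreeMap<>();
    for (int k : keys) {
      partial.merge(k, 1L, Long::sum);
    }
    return partial;
  }

  /** Outer part of the transposed plan: filter the groups, then add up the partial counts ($SUM0). */
  public static long filterGroupsThenSum(Map<Integer, Long> partial, int lo, int hi) {
    long total = 0;
    for (Map.Entry<Integer, Long> e : partial.entrySet()) {
      if (e.getKey() >= lo && e.getKey() < hi) {
        total += e.getValue();
      }
    }
    return total;
  }

  public static void main(String[] args) {
    // Illustrative data only: small integers standing in for timestamps.
    List<Integer> keys = Arrays.asList(1, 1, 2, 5, 5, 5, 9);
    Map<Integer, Long> partial = countPerKey(keys);
    System.out.println("direct count:        " + filterThenCount(keys, 2, 9));
    System.out.println("intermediate groups: " + partial.size());
    System.out.println("sum of partials:     " + filterGroupsThenSum(partial, 2, 9));
  }
}
```

      The intermediate result has as many rows as there are distinct keys, so with a near-unique timestamp it is almost as large as the input.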

      For this particular query, the ideal would be to push the filter into the intervals clause. Then we would not need to group by __timestamp. I am not sure why this is not happening.
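
      As a rough illustration of what pushing the filter into the intervals clause would mean for this query, the sketch below turns "year = 1997 AND month IN (4, 6)" into half-open ISO-8601 intervals in the same style as the intervals attribute in the plans above. This is plain Java using java.time, not Calcite's actual rewrite logic; the class and the toIntervals helper are hypothetical.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

public class IntervalSketch {
  /**
   * Hypothetical helper: converts a year plus a set of months into
   * start/end interval strings, with an exclusive upper bound.
   */
  public static List<String> toIntervals(int year, int... months) {
    List<String> intervals = new ArrayList<>();
    for (int month : months) {
      LocalDate start = LocalDate.of(year, month, 1);
      LocalDate end = start.plusMonths(1); // first day of the following month
      intervals.add(start + "T00:00:00.000/" + end + "T00:00:00.000");
    }
    return intervals;
  }

  public static void main(String[] args) {
    // Corresponds to YEAR = 1997 AND (MONTH = 4 OR MONTH = 6).
    for (String interval : toIntervals(1997, 4, 6)) {
      System.out.println(interval);
    }
  }
}
```

      With intervals like these, the DruidQuery could restrict the scan by time and compute a single COUNT, with no need to group by __timestamp.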

      nishantbangarwa, bslim: how bad is the query with DruidAggregateFilterTransposeRule enabled, in your opinion? Is this a show-stopper for Calcite 1.12?

            People

              julianhyde Julian Hyde