[CALCITE-1828] Push the FILTER clause into Druid as a Filtered Aggregator - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 1.12.0
Fix Version/s: 1.14.0
Component/s: druid-adapter
Labels: None

Description

Druid supports a special aggregator, the Filtered Aggregator, which lets an aggregation apply its own filter independently of any other filters in the Druid query.

An example where the filtered aggregator is useful:

SELECT 
sum("col1") FILTER (WHERE <condition1>),
sum("col2") FILTER (WHERE <condition2>)
FROM "table";

Currently, Calcite scans Druid and then does the filtering and aggregation itself. With filtered aggregators, both the filter and the aggregation can be pushed into Druid.
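For illustration, here is a rough sketch of the kind of Druid native query the example SQL could translate to once the FILTER clauses are pushed down. It is not the adapter's actual output: the dimension names `dim1` and `dim2`, the values `a` and `b`, the aggregator names `s1` and `s2`, the `longSum` aggregator type, and the interval are all assumptions standing in for `<condition1>`, `<condition2>`, and the real schema.

```python
import json


def selector(dimension, value):
    """Druid 'selector' filter: matches rows where dimension equals value."""
    return {"type": "selector", "dimension": dimension, "value": value}


def filtered_sum(name, field, flt):
    """Wrap a longSum aggregator in a Druid 'filtered' aggregator."""
    return {
        "type": "filtered",
        "filter": flt,
        "aggregator": {"type": "longSum", "name": name, "fieldName": field},
    }


# Hypothetical stand-ins for <condition1> and <condition2>.
condition1 = selector("dim1", "a")
condition2 = selector("dim2", "b")

query = {
    "queryType": "timeseries",
    "dataSource": "table",
    "granularity": "all",
    # Assumed catch-all interval; a real query would use the table's range.
    "intervals": ["1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"],
    "aggregations": [
        filtered_sum("s1", "col1", condition1),
        filtered_sum("s2", "col2", condition2),
    ],
}

print(json.dumps(query, indent=2))
```

Each aggregation carries its own filter, so the two sums can be computed in one pass inside Druid without a query-level filter.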

A few comments/questions:

1) If the conditions in all of the FILTER clauses are the same, then instead of pushing filtered aggregators individually, it would make more sense to push one single filter into the Druid query, i.e. the filters can be factored out into one filter. I don't see Calcite doing this currently. Does it already have such a rule in place?

2) The filters can/should only be pushed if they filter on dimension columns.

3) Currently, the above query would create the following relation:
DruidQuery -> Project -> Aggregate. There is already a rule called DruidAggregateProjectRule that matches this relation. Is it better to add logic to that rule, or to create a new rule that also matches that relation?
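Points 1 and 2 can be sketched as two small checks over Druid-style aggregation specs. This is only an illustration of the idea, not Calcite code: `hoist_common_filter` and `references_only_dimensions` are hypothetical helper names, only the `selector`, `and`, `or`, and `not` filter types are handled, and hoisting is assumed to be applied to an ungrouped aggregate (with a GROUP BY, a query-level filter could drop groups that a filtered aggregator would keep).

```python
def hoist_common_filter(aggregations):
    """If every aggregation is 'filtered' with the same filter, return
    (that filter, the unwrapped aggregators). Otherwise return
    (None, aggregations) unchanged."""
    if not aggregations:
        return None, aggregations
    if any(agg.get("type") != "filtered" for agg in aggregations):
        return None, aggregations
    first = aggregations[0]["filter"]
    if all(agg["filter"] == first for agg in aggregations):
        return first, [agg["aggregator"] for agg in aggregations]
    return None, aggregations


def references_only_dimensions(flt, dimensions):
    """True if the filter touches only the given dimension columns.
    Unknown filter types are rejected, which errs on the side of not pushing."""
    kind = flt.get("type")
    if kind == "selector":
        return flt["dimension"] in dimensions
    if kind in ("and", "or"):
        return all(references_only_dimensions(f, dimensions) for f in flt["fields"])
    if kind == "not":
        return references_only_dimensions(flt["field"], dimensions)
    return False


# Hypothetical example: both sums share the same condition on "dim1".
same = {"type": "selector", "dimension": "dim1", "value": "a"}
aggs = [
    {"type": "filtered", "filter": same,
     "aggregator": {"type": "longSum", "name": "s1", "fieldName": "col1"}},
    {"type": "filtered", "filter": same,
     "aggregator": {"type": "longSum", "name": "s2", "fieldName": "col2"}},
]

query_filter, plain_aggs = hoist_common_filter(aggs)
pushable = references_only_dimensions(same, {"dim1", "dim2"})
```

In this example `query_filter` ends up holding the shared selector filter and `plain_aggs` the two bare `longSum` aggregators, which is the "one single filter" form described in point 1.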


People

Assignee: Zain Humayun
Reporter: Zain Humayun
Votes: 1
Watchers: 5

Dates

Created: 02/Jun/17 18:37
Updated: 27/Feb/24 22:24
Resolved: 10/Jul/17 18:21