[CALCITE-4213] Druid plans with small intervals should be chosen over full interval scan plus filter - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: druid-adapter
Labels: None

Description

The problem surfaced when DruidAdapterIT#testFilterTimestamp failed on the following query.

select count(*) as c
from "foodmart"
where extract(year from "timestamp") = 1997
and extract(month from "timestamp") in (4, 6)

Expected

EnumerableInterpreter
  DruidQuery(table=[[foodmart, foodmart]], intervals=[[1997-04-01T00:00:00.000Z/1997-05-01T00:00:00.000Z, 1997-06-01T00:00:00.000Z/1997-07-01T00:00:00.000Z]], projects=[[0]], groups=[{}], aggs=[[COUNT()]])

Actual

EnumerableInterpreter
  DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z]], filter=[AND(=(EXTRACT(FLAG(YEAR), $0), 1997), OR(=(EXTRACT(FLAG(MONTH), $0), 4), =(EXTRACT(FLAG(MONTH), $0), 6)))], groups=[{}], aggs=[[COUNT()]])

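Note that the actual plan uses a single interval (1900-01-09 to 2992-01-10) that covers effectively all data and keeps the year and month predicates as a filter. In most cases this is less efficient than the expected plan, which restricts the query to the two matching months (April and June 1997) and needs no filter.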
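To make the relationship between the predicates and the expected intervals concrete, here is a minimal sketch of how a year equality plus a set of months could be turned into half-open month intervals. It is not Calcite's implementation; the class IntervalSketch and the method intervalsFor are hypothetical names used only for illustration, and the sketch assumes UTC and does not merge adjacent months.

```java
import java.time.LocalDateTime;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Calcite. */
public class IntervalSketch {
  // Same textual shape as the intervals printed in the plans above (UTC assumed).
  private static final DateTimeFormatter FMT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

  /**
   * Builds one half-open interval [first day of month, first day of next month)
   * for each requested month of the given year.
   */
  public static List<String> intervalsFor(int year, int... months) {
    List<String> intervals = new ArrayList<>();
    for (int month : months) {
      YearMonth ym = YearMonth.of(year, month);
      LocalDateTime start = ym.atDay(1).atStartOfDay();
      // plusMonths rolls over to January of the next year for December.
      LocalDateTime end = ym.plusMonths(1).atDay(1).atStartOfDay();
      intervals.add(FMT.format(start) + "/" + FMT.format(end));
    }
    return intervals;
  }

  public static void main(String[] args) {
    // Corresponds to: extract(year ...) = 1997 and extract(month ...) in (4, 6)
    System.out.println(intervalsFor(1997, 4, 6));
  }
}
```

For the query in this issue, the sketch yields the same two intervals that appear in the expected plan, which is why no residual filter is needed there.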

Issue Links

is related to: CALCITE-4221 Update stale integration tests in Druid adapter (Closed)

People

Assignee: Unassigned

Reporter: Stamatis Zampetakis

Votes: 0

Watchers: 2

Dates

Created: 02/Sep/20 10:40

Updated: 03/Sep/20 06:59