[CALCITE-1656] Improve cost function in DruidQuery to encourage early column pruning

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.12.0
    • Component/s: druid
    • Labels:

      Description

      Consider the query below:

      select "countryName", floor("time" to DAY), cast(count(*) as integer) as c
      from "wiki"
      where floor("time" to DAY) >= '1997-01-01 00:00:00'
        and floor("time" to DAY) < '1997-09-01 00:00:00'
      group by "countryName", floor("time" to DAY)
      order by c limit 5
      

      The resulting Druid query:

      {
        "queryType": "select",
        "dataSource": "wikiticker",
        "descending": false,
        "intervals": [
          "1900-01-09T00:00:00.000/2992-01-10T00:00:00.000"
        ],
        "dimensions": [
          "channel",
          "cityName",
          "comment",
          "countryIsoCode",
          "countryName",
          "isAnonymous",
          "isMinor",
          "isNew",
          "isRobot",
          "isUnpatrolled",
          "metroCode",
          "namespace",
          "page",
          "regionIsoCode",
          "regionName",
          "user"
        ],
        "metrics": [
          "count",
          "added",
          "deleted",
          "delta",
          "user_unique"
        ],
        "granularity": "all",
        "pagingSpec": {
          "threshold": 16384,
          "fromNext": true
        },
        "context": {
          "druid.query.fetch": false
        }
      }
      

      Note that the above Druid query has extra dimensions which are not required.

        Activity

        nishantbangarwa Nishant Bangarwa added a comment -

        Working on this; will add a patch with a test case soon.

        julianhyde Julian Hyde added a comment -

        That's great - but please make it a PR rather than a patch.

        nishantbangarwa Nishant Bangarwa added a comment -

        Jesus Camacho Rodriguez, Julian Hyde: please have a look at https://github.com/apache/calcite/pull/382

        julianhyde Julian Hyde added a comment -

        Nishant Bangarwa, A couple of review comments:

        • Please explain in the test case what you consider to be a "good" or "bad" plan. Otherwise future maintainers will break it.
        • The comment "Cost of Select > GroupBy > Timeseries > TopN" doesn't agree with the numbers (.1, .08, .04, .06).
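
        For illustration only, the multipliers quoted above could be encoded as explicit per-query-type constants; read numerically, they rank the query types as Select (.1) > GroupBy (.08) > TopN (.06) > Timeseries (.04), which is why they disagree with the comment. This is a hypothetical sketch, not the code in the PR:

        // Hypothetical per-query-type cost multipliers, using the values quoted in the
        // review comment above. Numerically they rank Select > GroupBy > TopN > Timeseries.
        enum DruidQueryTypeCost {
          SELECT(0.1), GROUP_BY(0.08), TOP_N(0.06), TIMESERIES(0.04);

          final double multiplier;

          DruidQueryTypeCost(double multiplier) {
            this.multiplier = multiplier;
          }
        }
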
        nishantbangarwa Nishant Bangarwa added a comment -

        Julian Hyde: Fixed the above comments. One last pending discussion is about including the number of fields to be queried as part of the cost function of DruidQuery.
        I had a discussion with Jesus Camacho Rodriguez and it seems that DruidQuery may not be the best place to adjust cost based on reading more or fewer columns; it should ideally be part of a TableScan instead. But the existing cost model is built around the number of rows, i.e. row cardinality, and not around the number of columns that need to be scanned. IMO the number of columns being scanned is an important measure for columnar databases, and we should maybe also account for it in TableScan, or have a new ColumnarTableScan that does. Any thoughts on this?

        I think it may be OK to have the number of fields as part of DruidQuery's cost for now, until we improve our cost model to also include the number of columns being scanned.
        Do you agree?
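
        A minimal sketch of the idea under discussion: scale a DruidQuery's self-cost by the fraction of the table's columns it actually fetches, so that narrower queries are preferred. This is illustrative only; the field and method references are assumptions, not the code in the PR:

        // Illustrative only: make a DruidQuery cheaper when it fetches fewer of the
        // table's columns, so the planner prefers plans that prune columns early.
        @Override public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
          double rowCount = mq.getRowCount(this);
          double projectedFields = getRowType().getFieldCount();
          double tableFields = table.getRowType().getFieldCount();  // 'table' assumed to be the scanned RelOptTable
          return planner.getCostFactory()
              .makeCost(rowCount, 0, 0)
              .multiplyBy(projectedFields / tableFields);
        }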

        julianhyde Julian Hyde added a comment -

        If I/O is the major concern then Jesus Camacho Rodriguez is right, we should focus on table scans. But there is also general processing cost. I have no objection to having a small component of the cost of a project or even a filter be due to the number of columns. Then if we hit a situation where, all else being equal, we have a choice between a 10 column project and a 100 column project, or indeed between a 10 column DruidQuery and a 100 column DruidQuery, then we'd choose the narrower one.
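
        A sketch of that idea for a generic Project, with a hypothetical weighting constant (not actual Calcite code): give the column count a small cpu component so that, all else being equal, a 10-column project beats a 100-column project.

        // Hypothetical: add a small per-column cpu term to a Project's cost so the
        // narrower of two otherwise-equal projects wins.
        private static final double COLUMN_COST_FACTOR = 0.01;  // assumed constant

        RelOptCost projectCost(RelOptPlanner planner, RelMetadataQuery mq, Project project) {
          double rowCount = mq.getRowCount(project);
          double columnCount = project.getRowType().getFieldCount();
          double cpu = rowCount * columnCount * COLUMN_COST_FACTOR;
          return planner.getCostFactory().makeCost(rowCount, cpu, 0);
        }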

        julianhyde Julian Hyde added a comment -

        Nishant Bangarwa, by the way, I'd like to look into the test that you assert is broken, but my VirtualBox installation is currently broken and therefore I can't run the DruidAdapterIT tests. Working on it...

        nishantbangarwa Nishant Bangarwa added a comment -

        Julian Hyde: Also, it seems strange that the VolcanoPlanner does not take the cpu or io components of the cost into account here and only considers rowCount - https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoCost.java#L100-L100

        Can you provide some context here?
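
        For context, the comparison being referenced is roughly of this shape (a simplified paraphrase, not a verbatim copy of VolcanoCost): ordering is effectively decided by rowCount alone, so the cpu and io components never break ties between plans.

        // Simplified paraphrase of the kind of comparison discussed above.
        public boolean isLe(RelOptCost other) {
          VolcanoCost that = (VolcanoCost) other;
          return this == that || this.rowCount <= that.rowCount;
        }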

        julianhyde Julian Hyde added a comment -

        Simple is better.

        Totally ordered costs are much easier to handle than partially ordered costs.

        julianhyde Julian Hyde added a comment -

        +1

        I have staged this change onto my https://github.com/julianhyde/calcite/tree/1656-druid-prune branch, and added a linear interpolation function to make the cost function better behaved for small or large numbers of fields.

        On this branch I have also fixed the plan change due to CALCITE-1601.

        I will commit to master after Jesus Camacho Rodriguez has fixed DruidAdapterIT breakage caused by CALCITE-1661.
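
        A sketch of what such a linear interpolation could look like (hypothetical helper; the name and signature are assumptions, not necessarily what is on the branch): map the field count onto a bounded multiplier, clamping at both ends so very small or very large field counts keep the cost well behaved.

        // Hypothetical helper: linearly interpolate x in [minX, maxX] onto [minY, maxY],
        // clamping at the ends so extreme field counts do not produce degenerate costs.
        static double linear(int x, int minX, int maxX, double minY, double maxY) {
          if (x <= minX) {
            return minY;
          }
          if (x >= maxX) {
            return maxY;
          }
          return minY + (maxY - minY) * (x - minX) / (maxX - minX);
        }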

        julianhyde Julian Hyde added a comment -

        Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/c90fddfb. Thanks for the PR, Nishant Bangarwa!

        julianhyde Julian Hyde added a comment -

        Resolved in release 1.12.0 (2017-03-24).


          People

          • Assignee: Nishant Bangarwa
          • Reporter: Nishant Bangarwa
          • Votes: 0
          • Watchers: 4
