[KYLIN-740] Slowness with many IN() values - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: v1.1, v1.4.0
Component/s: None
Labels:
None

Sprint:
Sprint 43, Sprint 44, Sprint 45, Sprint 46, Sprint 47, Sprint 48

Description

[from dev mail list]

Huang Hua <huanghua@mininglamp.com>

We were testing the query efficiency with some sql containing 'IN' keyword
on a relatively big table which holds 22+ Million records.

It seems to us that sql with a number of items in 'IN' show very poor
performance in terms of running time.

The fact table used has three columns: themonth, id, trans_at, and we
created the cube with 2 dimensions and 1 measure:

Dimensions: themonth, id

Measure: sum(trans_at)

Basically our queries are all similar to the one:

select themonth, id, sum(trans_at) from the_table where id IN (id1, id2,
id3, .) group by themonth, id limit 10

When the number of ids reaches about 30-50, the response time becomes
considerable long in terms of minutes. By the way, we are running kylin on
12 nodes with 2 name nodes and 10 data nodes.

Besides, we tried to break the above sql with 'IN' keyword into several
sub-sql and merge the results of each sub-sql to get the final result like
the following:

select themonth, id, sum(trans_at) from the_table where id = id1 group by
themonth, id limit 10

select themonth, id, sum(trans_at) from the_table where id = id2 group by
themonth, id limit 10

select themonth, id, sum(trans_at) from the_table where id = id3 group by
themonth, id limit 10

Surprisedly, the total running time of those sub-sql is much less than the
running time of the orginal sql with 'IN' keyword.

Initially I thought the backend query engine should handle 'IN' keyword in
the similar way as the individual sql with '=' keyword, but it seems not.

Can anybody provide any thoughts regarding this? Any ideas on how to tune
the queries containing 'IN' keyword?

Attachments

Activity

People

Assignee:: Hongbin Ma

Reporter:: liyang

Votes:: 3 Vote for this issue

Watchers:: 6 Start watching this issue

Dates

Created:: 29/Apr/15 03:06

Updated:: 26/Oct/15 03:25

Resolved:: 01/Sep/15 04:07