Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- Impala 2.2, Impala 2.3.0
- None
Description
The planner should use statistics and simple heuristics to order filters based on selectivity and the cost of applying each filter per tuple. Highly selective filters should be applied before less selective ones.
Ordering can be as follows:
- C1 = 10
- C1 in (1,2,3)
- C1 < 10
- C1 between 10 and 20
- C1 like '%A'
- C1 like '%A%'
- Other....
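The ordering above can be sketched as a small ranking function. This is only an illustration, not the planner's implementation: `RANK_RULES`, `filter_rank`, and `order_filters` are hypothetical names, and matching on predicate text with regular expressions stands in for inspecting a real expression tree and column statistics.

```python
import re

# Hypothetical rank table mirroring the list above: a lower rank is applied first.
# Matching on predicate text is an assumption made to keep the sketch short.
RANK_RULES = [
    (re.compile(r"\w+\s*=\s*\S+"), 0),                                  # C1 = 10
    (re.compile(r"\w+\s+in\s*\(.*\)", re.IGNORECASE), 1),               # C1 in (1,2,3)
    (re.compile(r"\w+\s*(<=|>=|<|>)\s*\S+"), 2),                        # C1 < 10
    (re.compile(r"\w+\s+between\s+\S+\s+and\s+\S+", re.IGNORECASE), 3),  # C1 between 10 and 20
    (re.compile(r"\w+\s+like\s+'%[^%]*'", re.IGNORECASE), 4),           # C1 like '%A'
    (re.compile(r"\w+\s+like\s+'%.*%'", re.IGNORECASE), 5),             # C1 like '%A%'
]
OTHER_RANK = len(RANK_RULES)  # anything unrecognized is applied last


def filter_rank(predicate):
    """Return the heuristic rank of a single-column predicate string."""
    text = predicate.strip()
    for pattern, rank in RANK_RULES:
        if pattern.fullmatch(text):
            return rank
    return OTHER_RANK


def order_filters(predicates):
    """Sort predicates so the heuristically most selective run first (stable sort)."""
    return sorted(predicates, key=filter_rank)
```

Because the sort is stable, predicates of the same shape keep their original relative order; a statistics-based planner would break such ties with per-column selectivity estimates instead.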
For the query below, the correct ordering gives a 30% speedup.
Query
select
count(*)
from
lineitem
where
l_comment like '%a%'
and l_orderkey = 1024;
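To see why the order of the two conjuncts matters, here is a rough per-tuple cost model, assuming the filters are independent and evaluation short-circuits on the first failing filter. The function names and the cost and selectivity numbers are hypothetical placeholders, not measurements from this issue, and they are not meant to reproduce the reported speedup.

```python
def expected_cost_per_tuple(filters):
    """Expected cost of evaluating (cost, selectivity) filters in the given order.

    A filter is only evaluated for the fraction of tuples that passed
    every filter before it.
    """
    total = 0.0
    pass_fraction = 1.0
    for cost, selectivity in filters:
        total += pass_fraction * cost
        pass_fraction *= selectivity
    return total


def best_order(filters):
    """Order independent filters by cost / (1 - selectivity), ascending."""
    def rank(f):
        cost, selectivity = f
        # A filter that passes every tuple never pays for itself, so it goes last.
        return cost / (1.0 - selectivity) if selectivity < 1.0 else float("inf")
    return sorted(filters, key=rank)


# Hypothetical (cost, selectivity) pairs, chosen only for illustration.
EQUALITY = (1.0, 0.001)  # stands in for l_orderkey = 1024: cheap, very selective
LIKE = (10.0, 0.5)       # stands in for l_comment like '%a%': costly, less selective

cost_like_first = expected_cost_per_tuple([LIKE, EQUALITY])
cost_equality_first = expected_cost_per_tuple([EQUALITY, LIKE])
```

Under these assumed numbers, evaluating the equality filter first is much cheaper per tuple, because the expensive string match then runs on only a small fraction of rows.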
A follow-up experiment is needed to calculate the cost of filters for different data types.
Issue Links
- is related to IMPALA-3635 Order group by clause based on cardinality and cost (Resolved)