[IMPALA-9789] Disable ineffective bloom filters for Kudu scan - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: Impala 3.4.0
Fix Version/s: Impala 4.0.0
Component/s: Backend, Frontend
Labels: None

Description

In the bloom filter benchmark for Kudu, query TPCH-Q9 shows a performance regression. The query profile shows that 5 bloom filters are generated by hash joins, and some of those filters are not useful for filtering rows. When all bloom filters are pushed to Kudu, evaluating them adds extra cost to the Kudu scan, which causes the performance regression.

The regression on Q9 looks a lot like https://issues.apache.org/jira/browse/IMPALA-9302, where Q9 initially regressed a lot with multithreading because ineffective filters weren't being disabled. This query is a bit special in that many filters are pushed to scan 2, and most of them are not useful. Based on our experience there, we need to add a way to disable ineffective filters for Kudu scans.
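
As a rough illustration of what disabling an ineffective filter could look like, the Python sketch below counts how many rows a filter rejects and turns the filter off once it has seen enough rows and its rejection ratio is below a threshold. This is not the Impala or Kudu implementation: the class name `FilterStats`, the parameters `min_rows` and `min_reject_ratio`, and their default values are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class FilterStats:
    """Tracks how selective one runtime bloom filter has been so far.

    All names and thresholds here are hypothetical, for illustration only.
    """
    min_rows: int = 1000            # rows to observe before judging the filter
    min_reject_ratio: float = 0.1   # below this ratio the filter is deemed ineffective
    rows_seen: int = 0
    rows_rejected: int = 0
    enabled: bool = True

    def record(self, seen: int, rejected: int) -> None:
        """Add one batch of results and disable the filter if it rejects too little."""
        if not self.enabled:
            return  # a disabled filter is no longer evaluated
        self.rows_seen += seen
        self.rows_rejected += rejected
        if self.rows_seen >= self.min_rows:
            ratio = self.rows_rejected / self.rows_seen
            if ratio < self.min_reject_ratio:
                self.enabled = False


# A filter that rejects 1% of rows is disabled after 1000 rows;
# one that rejects 75% stays enabled.
weak = FilterStats()
weak.record(500, 0)
weak.record(500, 10)

strong = FilterStats()
strong.record(2000, 1500)
```

In this sketch a scan would check `enabled` before evaluating a filter on each batch, so a filter such as `weak` stops adding evaluation cost once it is shown to reject almost nothing.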

Issue Links

Dependent: KUDU-3140 Add heuristics to disable predicate evaluation/filtering for Bloom filter predicate (Resolved)

Is related to: IMPALA-3741 Push bloom filters to Kudu scanners (Resolved)

People

Assignee: Wenzhe Zhou

Reporter: Wenzhe Zhou

Votes: 0

Watchers: 2

Dates

Created: 28/May/20 18:04

Updated: 08/May/22 01:44

Resolved: 07/Jul/20 16:52