HIVE-1721

Use bloom filters to improve the performance of joins

      Description

      In the case of map-joins, it is likely that the big table will not find many matching rows in the small table.
      Currently, we perform a hash-map lookup for every row in the big table, which can be pretty expensive.

      It might be useful to try a bloom filter containing all the elements of the small table.
      Each element from the big table is first checked against the bloom filter, and only on a positive match
      is the small-table hash table probed.
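      A minimal sketch of the idea, assuming a standard bloom filter implementation such as Hadoop's org.apache.hadoop.util.bloom.BloomFilter (the method and table types here are illustrative, not Hive's actual join code):

          import java.nio.ByteBuffer;
          import java.util.Map;
          import org.apache.hadoop.util.bloom.BloomFilter;
          import org.apache.hadoop.util.bloom.Key;

          // Probe the bloom filter before the expensive hash-table lookup.
          // A negative answer is definite; a positive answer may be a false positive.
          static Object probeSmallTable(byte[] joinKey, BloomFilter bloom,
                                        Map<ByteBuffer, Object> smallTable) {
              if (!bloom.membershipTest(new Key(joinKey))) {
                  return null;                                 // definitely no match: skip the lookup
              }
              return smallTable.get(ByteBuffer.wrap(joinKey)); // verify the possible match
          }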

      Attachments

      1. hive-1721.patch.txt (13 kB, Siddhartha Gunda)

        Activity

        Siddhartha Gunda added a comment -

        I created some UDF and UDAF functions that can be used to build bloom filters and to query them.

        Sample ways to use them:
        STEP 1 : CREATE TEMPORARY FUNCTION bloom AS 'org.apache.hadoop.hive.contrib.genericudaf.GenericUDAFBuildBloom';
        STEP 2 : CREATE TEMPORARY FUNCTION bloom_filter AS 'org.apache.hadoop.hive.contrib.genericudf.GenericUDFBloomFilter';
        STEP 3 : CREATE TABLE 'NameOfBloomFilterTable' AS SELECT bloom('HashType', 'NumElements', 'ProbabilityOfFalsePositives', column1, column2, ...) FROM 'TableName';
        'NameOfBloomFilterTable' - name of the table in which the bloom filter is stored.
        'HashType' - type of hash function used to build the bloom filter. It accepts two values: 'jenkins' or 'murmur'.
        'NumElements' - number of elements in the table on which the bloom filter is being built.
        'ProbabilityOfFalsePositives' - acceptable probability of false positives.

        Example : CREATE TABLE tblBloom as SELECT bloom('jenkins', '20', '0.1',id,str) FROM tblOne;

        STEP 4 : ADD FILE 'PathOfBloomFilterTable';
        Example : ADD FILE /user/hive/warehouse/tblbloom40/000000_0;

        STEP 5 : Sample Use cases
        SELECT *,bloom_filter('jenkins', '20', '0.1', '000000_0', id, str) FROM Table1;

        SELECT *
        FROM Table1
        INNER JOIN Table2
        ON Table1.id = Table2.id
        WHERE bloom_filter('jenkins', '20', '0.1', '000000_0', Table1.id, Table1.str)

        Aleksandra Wozniak added a comment -

        What is the status of this task? Is anyone actively working on it?

        Ashutosh Chauhan added a comment -

        @Alex,
        No, this jira was created after map-join. I think the goal of this jira is to improve the performance of joins when one of the tables is big enough that it can't be read into each mapper's memory, but may still be amenable to other performance tricks.

        Alan Gates added a comment -

        It is possible to do bloom filter generation in parallel, see Pig's BuildBloom UDF for an example. It does require a serial reduce phase, but it is quite small since it involves ORing the bitmaps from all of the Bloom filters built in each map phase.
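        A sketch of the merge step Alan describes, assuming Hadoop's org.apache.hadoop.util.bloom.BloomFilter (all partial filters must be built with the same vector size, hash count, and hash type for the OR to be meaningful):

            import java.util.List;
            import org.apache.hadoop.util.bloom.BloomFilter;

            // Combine per-mapper partial filters by ORing their bit vectors.
            // Note: or() mutates the receiving filter in place.
            static BloomFilter merge(List<BloomFilter> partials) {
                BloomFilter merged = partials.get(0);
                for (int i = 1; i < partials.size(); i++) {
                    merged.or(partials.get(i));
                }
                return merged;
            }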

        alex gemini added a comment -

        The description says "in case of map-joins", so was this jira created before the "map side join" implementation?

        Ashutosh Chauhan added a comment -

        Last line should read "then launch second MR job to do step 3 & 4"

        Ashutosh Chauhan added a comment -

        @Alex,
        Reading the previous comments on this jira, this is proposed to work as follows:

        • Create a local task and launch it on the client machine, building a bloom filter on the medium-sized table (~200MB).
        • Create a Common Join MR job and launch it on the cluster. Also, ship the bloom filter built in the previous step to all the mapper nodes (via the Distributed Cache).
        • In the mapper, look up the key of every row of the large table in the bloom filter. If it exists, send that row to the reducer; otherwise filter it out (see the sketch after this list).
        • In the reducer, do the cross-product of the rows of the different tables for a given key to get the joined output.
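        A minimal mapper-side sketch of steps 2 and 3, assuming the filter was serialized to a file shipped via the Distributed Cache (the class name, file path, and key extraction are assumptions, not the actual patch):

            import java.io.DataInputStream;
            import java.io.FileInputStream;
            import java.io.IOException;
            import org.apache.hadoop.util.bloom.BloomFilter;
            import org.apache.hadoop.util.bloom.Key;

            public class BloomRowFilter {
                private final BloomFilter bloom = new BloomFilter();

                public BloomRowFilter(String localCachePath) throws IOException {
                    // BloomFilter is Writable, so it can be deserialized from the cached file.
                    try (DataInputStream in = new DataInputStream(new FileInputStream(localCachePath))) {
                        bloom.readFields(in);
                    }
                }

                /** True if the row should be shuffled to the reducer (possible join match). */
                public boolean pass(byte[] joinKey) {
                    return bloom.membershipTest(new Key(joinKey));
                }
            }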

        As outlined above, this will be a win because you will shuffle much less data from the mappers to the reducers. The assumptions, though, are that the cost of building the bloom filter on the client machine is small, that there is a huge difference in the sizes of the two tables, and that the join key is highly selective. One or more of these assumptions may not hold, in which case there might be a performance loss. So, there is a trade-off in deciding when to use this.

        I don't know if there exists a way to compute a bloom filter in a distributed fashion. If there is such a way, then you can do step 1 through an MR job (instead of locally) and on a much larger table, and then launch second MR job to do step 2 & 3. Again, there will be trade-offs here.

        alex gemini added a comment -

        The original thought is to increase the map-side small-table size; this depends heavily on how we chunk the big table. If the big table is chunked into 16 buckets, the small table must automatically be changed to 16 buckets too (the same logic as bucket map join). If the big table is partitioned by (region string), the small table also needs to be partitioned by (region) first, and we must also make sure the smallest chunk size is not bigger than the current small-table size limit defined by Hive. The partition case is more common; we can avoid a common join by always chunking the small table to match the big table's format.

        alex gemini added a comment -

        I'm wondering how we apply the bloom filter to the big table. We use map-side join for small tables < 25M; if we build a bloom filter on the small table, we can maybe increase the small-table size to 200M. But in the big table's map stage, we need to read the bloom filter, write the intermediate result back to disk, and then read that intermediate result to check against the real small table. We still can't hold the actual small table in memory (correct the logic if I'm wrong), so we pay the cost of writing an intermediate result that is very close to the final result. In this case we can't increase the number of maps, because it will double the I/O penalty. I guess it will only be a benefit in a three-table join on the same join key, one small table with two big ones. In my opinion, other DB systems benefit from bloom filters because they can hold the intermediate result in memory for further processing (like Oracle) or emit it immediately (like HBase).

        Mikalai Parafeniuk added a comment -

        Hello.
        I am Mikalai Parafeniuk from Belarusian State University. I'm a third-year student, and I'm looking for a project to contribute to as part of GSoC 2012.
        I have already learned the basics of bloom filters from the internet and the useful links in this thread, but I need some further reading. Could you propose some? I also need to learn the Hive codebase; I want to do that by fixing small bugs. Could you give me an idea of where I can start?

        Srinivasan Sembakkam Rajivelu added a comment -

        @Namit,
        Could you please assign this ticket to me? Can I take it for GSoC 2012?

        Srinivasan Sembakkam Rajivelu added a comment -

        I am planning to take this ticket as my proposal for GSoC 2012. I am very interested in this bloom filter implementation. Is it possible to assign this ticket to me?

        Carl Steinbach added a comment -

        @Srinivasan: Doesn't look like it. Feel free to have a go at it.

        Srinivasan Sembakkam Rajivelu added a comment -

        Is anyone working on this ticket? Can I take it and work on it?

        Siying Dong added a comment -

        Andrew, what do you mean by "the filter could be built in parallel with an MR job"? Our initial plan was to only build filter based on smaller tables and apply the filter against the big table to reduce data to be shuffled.

        For the syntax, the plan is to use syntax like MAPJOIN. We can do something like SELECT /*+ BLOOMFILTER(t1) */ ... FROM t1 JOIN t2 ...

        John Sichi added a comment -

        Standard practice is to add new conf parameters, with sensible defaults, but including a way to disable the feature completely (and make that the default until we have sufficient confidence in it).

        J. Andrew Key added a comment -

        I may use org.apache.hadoop.util.bloom.BloomFilter (http://svn.apache.org/repos/asf/hadoop/common/trunk/common/src/java/org/apache/hadoop/util/bloom/BloomFilter.java).
        Should there be a query syntax for controlling the creation of the bloom filter with options such as:
        1. vector size, hash function type and number?
        2. expected table size and acceptable level of false positives?

        If anyone has any preference or ideas, please share them.
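        One possible shape for option 2, derived from the standard sizing formulas m = -n·ln(p)/(ln 2)^2 and k = (m/n)·ln 2 (the class and method names here are illustrative only, not a committed design):

            import org.apache.hadoop.util.bloom.BloomFilter;
            import org.apache.hadoop.util.bloom.Key;
            import org.apache.hadoop.util.hash.Hash;

            public class BloomSizing {
                // Build a filter sized for n expected elements and false-positive rate p.
                public static BloomFilter create(long n, double p) {
                    int m = (int) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
                    int k = Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
                    return new BloomFilter(m, k, Hash.MURMUR_HASH);
                }

                public static void main(String[] args) {
                    BloomFilter f = create(1_000_000, 0.01);   // ~9.6M bits, 7 hash functions
                    f.add(new Key("42".getBytes()));
                    System.out.println(f.membershipTest(new Key("42".getBytes())));  // true
                }
            }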

        J. Andrew Key added a comment -

        Thanks. For large-ish tables, maybe the filter could be built in parallel with an MR job. I will take a look.

        Namit Jain added a comment -

        No one is working on this.
        Feel free to take it over.

        J. Andrew Key added a comment -

        Is anyone actively working on this? I've worked with Bloom filters before and was wondering if this issue was perhaps abandoned. If anyone has any notes or code for me to review, I would love to take a crack at this one.

        Namit Jain added a comment -

        That depends on the size of the filtered big table:

        To start with, we can do a join of the small table with the filtered big table using the current infrastructure.
        We may need some special tricks for outer joins, but it should be possible.

        Siying Dong added a comment -

        So the idea is that the filtered rows of the big table fit in memory, so that we can sort them and pay sequential I/O to read the small table back? Or do we do an external sort of the filtered rows from the big table?

        Joydeep Sen Sarma added a comment -

        @Siying - that's a good question. I don't know statistically how common it is, but we have heard requests along these lines. For example, one use case is that a project wants to get some data for a reasonably large subset of the users. One use case we have seen was where 0.2% of users were interesting - but even 0.2% is very large for us. People also use semi-joins, and that pretty much says that people want to filter rows out.

        Namit Jain added a comment -

        Yes, even after all the optimizations, map-join is restricted to tables < ~25M.

        There are lots of scenarios where the small table is ~100M.

        Siying Dong added a comment -

        Is it a common use case? The small table is so big that it doesn't even fit in memory, yet most rows in the big table don't match any of its keys.

        Namit Jain added a comment -

        T2 does not fit in memory completely.
        We create a bloom filter for T2, which does fit in memory. The assumption here is that by filtering out a lot of rows from T1, we substantially reduce the number of rows that go to the reducer, which helps the join performance.

        Joydeep Sen Sarma added a comment -

        A bloom filter takes about 10 bits per entry (at a reasonable false-positive probability; I remember reading this value on Wikipedia).

        Our Java hash tables take ~2000 bytes per key-value pair (based on tests done by Liyin for reasonably sized keys/values).

        So the idea is that if the small table is too big to be loaded into memory, but small enough that its bloom filter can be stored in memory, then we can first filter the large table and then do the sort.
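        For reference, the "10 bits per entry" figure follows from the standard bloom filter sizing formulas (with the optimal number of hash functions k):

            m/n = -ln(p) / (ln 2)^2,    k = (m/n) * ln 2

        For p = 1%, this gives m/n ≈ 9.6 bits per entry and k ≈ 7 hash functions, which matches the figure quoted above.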

        Siying Dong added a comment -

        If T2 is always in a hash table in memory, then all we have to do to check the hash table is calculate one hash value and probe the table. That is less expensive than calculating k hash values. Only if checking the hash table is much more expensive than calculating k hash values (for example, if it is on disk) do we see an improvement. Am I wrong?

        Namit Jain added a comment -

        It works very well if one of the tables (T1) is much larger than the other table (T2).
        T2 is not small enough to be a candidate for map-join.

        Then, via the bloom filter, we can filter out most of the rows of T1.

        Siying Dong added a comment -

        Shouldn't a bloom filter be even more expensive than a normal hash table?
        Instead of calculating one hash, we'll have to calculate k of them, and calculating the hash is the most expensive part for us.
        It probably works if most of the keys are not in the small table and the keys and values of the small table are stored on disk. I'm not sure that is our use scenario.

        Namit Jain added a comment -

        Got it, a good article at http://antognini.ch/papers/BloomFilters20080620.pdf
        Joydeep Sen Sarma added a comment -

        I am not so sure about this.

        Consider a hash table which has a very large number of buckets (relative to the number of elements in the hash table). A lookup in the hash table stops as soon as we hit an empty bucket, which requires us only to compute the hashcode(). If #buckets >> #elements, then for a miss the likely average cost is just the cost of the hashcode routine.

        Now consider a bloom filter. Here we have to compute multiple hash codes (or at least one). On top of that, with bloom filters there's an added cost for each positive (many hashcode computations).

        From this reasoning, bloom filters would be more expensive, not less, for small-table joins. Note that Java's hash tables do allow specifying the number of buckets, so the strategy outlined here (deliberately constructing a sparse hash table) is feasible; a minimal sketch follows.
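        A minimal sketch of that sparse hash table, using java.util.HashMap's capacity/load-factor constructor (the sizing factor of 4 is an illustrative choice, not a measured one):

            import java.util.HashMap;
            import java.util.Map;

            int expectedElements = 1_000_000;
            // Roughly 4x more buckets than elements; the load factor of 1.0 keeps
            // the table from resizing early, so most misses terminate at an empty
            // bucket after a single hashCode() computation.
            Map<String, String> sparse = new HashMap<>(4 * expectedElements, 1.0f);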

        Stepping back, this makes sense, because bloom filters are designed for large data sets (or at least data sets that don't easily fit in memory), not small ones (that fit easily in memory).

        It would be more interesting to consider bloom filters for join scenarios that cannot be handled with map join. For example, if the small table had 1M keys and map-join is not able to handle that large a hash table, then one can use bloom filters:

        • Filter (probabilistically) the large table against the medium-sized table by looking up each key in the bloom filter of the medium-sized table (a map-side bloom filter). Note: this is not a join, just a filter.
        • Take the filtered output and do a sort-merge join against the medium-sized table (by now the data size should be greatly reduced, and the cost of sorting should go down tremendously).

        There's lots of literature around this; it's a pretty well-known technique. It's quite different from what's proposed in this jira.


  People

    Assignee: Unassigned
    Reporter: Namit Jain
    Votes: 5
    Watchers: 19