[IMPALA-2707] Add FindOrInsert method to hash table to avoid unnecessary probe in aggregation - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: Impala 2.3.0
Fix Version/s: Impala 2.5.0
Component/s: None
Labels:
- performance
- ramp-up

Target Version: Product Backlog

Description

For each input row, the aggregation node calls HashTable::Find() and then, if the grouping key isn't already present in the table, HashTable::Insert(). Both methods probe the hash table to find the same bucket. If we added a FindOrInsert() method to the hash table that returned a modifiable iterator pointing to the bucket, we could save one probe for every row that introduces a new grouping key.
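To make the saving concrete, here is a minimal sketch of the idea on a toy linear-probing table. It is not Impala's HashTable: `ToyHashTable`, `Bucket`, `Probe()`, `num_probes()` and the two `CountWith...` helpers are hypothetical names invented for illustration, and the sketch returns a bucket pointer plus a `found` flag rather than the modifiable iterator proposed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Toy fixed-capacity, linear-probing table mapping a grouping key to a count.
// Hypothetical illustration only; this is not Impala's HashTable.
class ToyHashTable {
 public:
  struct Bucket {
    bool filled = false;
    int64_t key = 0;
    int64_t count = 0;
  };

  explicit ToyHashTable(size_t capacity) : buckets_(capacity) {}

  // Returns the bucket holding `key`, or nullptr if the key is absent.
  Bucket* Find(int64_t key) {
    Bucket* b = Probe(key);
    return (b != nullptr && b->filled) ? b : nullptr;
  }

  // Inserts `key`, assuming the caller already knows it is absent.
  // Returns nullptr if the table is full.
  Bucket* Insert(int64_t key) {
    Bucket* b = Probe(key);
    if (b != nullptr) Fill(b, key);
    return b;
  }

  // One probe: returns the bucket for `key`, creating it if needed.
  // `*found` reports whether the key was already present.
  Bucket* FindOrInsert(int64_t key, bool* found) {
    Bucket* b = Probe(key);
    *found = (b != nullptr && b->filled);
    if (b != nullptr && !b->filled) Fill(b, key);
    return b;
  }

  size_t num_probes() const { return num_probes_; }

 private:
  // Walks the probe sequence and returns either the bucket holding `key`
  // or the first empty bucket; nullptr if the table is full without `key`.
  Bucket* Probe(int64_t key) {
    ++num_probes_;
    const size_t n = buckets_.size();
    if (n == 0) return nullptr;
    const size_t start = std::hash<int64_t>{}(key) % n;
    for (size_t i = 0; i < n; ++i) {
      Bucket& b = buckets_[(start + i) % n];
      if (!b.filled || b.key == key) return &b;
    }
    return nullptr;
  }

  static void Fill(Bucket* b, int64_t key) {
    b->filled = true;
    b->key = key;
    b->count = 0;
  }

  std::vector<Bucket> buckets_;
  size_t num_probes_ = 0;
};

// Current pattern: Find(), then Insert() on a miss (two probes per new key).
void CountWithFindThenInsert(const std::vector<int64_t>& keys, ToyHashTable* ht) {
  for (int64_t key : keys) {
    ToyHashTable::Bucket* b = ht->Find(key);
    if (b == nullptr) b = ht->Insert(key);
    if (b != nullptr) ++b->count;
  }
}

// Proposed pattern: a single FindOrInsert() probe per input row.
void CountWithFindOrInsert(const std::vector<int64_t>& keys, ToyHashTable* ht) {
  for (int64_t key : keys) {
    bool found = false;
    ToyHashTable::Bucket* b = ht->FindOrInsert(key, &found);
    if (b != nullptr) ++b->count;
  }
}
```

In this sketch, six input rows with three distinct keys cost nine probes with Find() followed by Insert(), and six with FindOrInsert(). The gap grows with the number of distinct groups, which is why aggregations with a large output benefit most.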

There is already a TODO in the partitioned-aggregation-node-ir.cc code for this, so I'm creating a JIRA to track the issue.

This could speed up aggregations with large output size significantly, e.g. TPC-H query 13 (see ~~IMPALA-2470~~).

Issue Links

is a child of: IMPALA-2755 Clean up memory management in backend (Resolved)

People

Assignee: Tim Armstrong

Reporter: Tim Armstrong

Votes: 0

Watchers: 3

Dates

Created: 24/Nov/15 18:00

Updated: 09/Dec/15 00:19

Resolved: 07/Dec/15 19:34