Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Currently each aggregation operation uses a separate hash table. This has the following disadvantages:
- Multiple probes happen for the same key, once per hash table.
- Space for keys is duplicated across hash tables.
- Mutexes are acquired once per aggregation operation for each value.
A more efficient design is a single shared hash table in which the payload for each key is partitioned among the aggregation handles, so that each key is stored and probed only once.
Changes are needed both to the aggregate and merge operations.
Preliminary experiments suggest up to a 3x speedup for TPC-H Q1, which has 8 aggregation operations.
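The shared-table design described above can be illustrated with a minimal sketch. It is not the project's implementation: the names `AggHandle`, `SharedAggTable`, `aggregate`, and `merge_from` are hypothetical, and a single coarse lock stands in for whatever finer-grained locking the real hash table uses. The sketch only shows the idea that one probe and one lock acquisition per row update every handle's slot in the payload, and that merge works on the same per-key payload layout.

```python
import threading
from dataclasses import dataclass
from typing import Any, Callable, Dict, Hashable, List, Sequence


@dataclass(frozen=True)
class AggHandle:
    """Hypothetical aggregation handle that owns one slot of each key's payload."""
    init: Callable[[], Any]            # () -> initial state
    update: Callable[[Any, Any], Any]  # (state, input value) -> new state
    merge: Callable[[Any, Any], Any]   # (state, other state) -> new state


class SharedAggTable:
    """One hash table for all handles; payload[i] belongs to handles[i]."""

    def __init__(self, handles: Sequence[AggHandle]) -> None:
        self.handles = list(handles)
        self.table: Dict[Hashable, List[Any]] = {}
        # Assumption: one coarse lock, for illustration only.
        self.lock = threading.Lock()

    def _payload(self, key: Hashable) -> List[Any]:
        # The key is stored once, no matter how many handles there are.
        payload = self.table.get(key)
        if payload is None:
            payload = [h.init() for h in self.handles]
            self.table[key] = payload
        return payload

    def aggregate(self, key: Hashable, values: Sequence[Any]) -> None:
        """Apply one input row: values[i] is the argument for handles[i]."""
        with self.lock:                   # one lock acquisition per row
            payload = self._payload(key)  # one probe per row
            for i, (handle, value) in enumerate(zip(self.handles, values)):
                payload[i] = handle.update(payload[i], value)

    def merge_from(self, other: "SharedAggTable") -> None:
        """Fold another table (built with the same handles) into this one."""
        with self.lock:
            for key, other_payload in other.table.items():
                payload = self._payload(key)
                for i, handle in enumerate(self.handles):
                    payload[i] = handle.merge(payload[i], other_payload[i])


def _max_merge(a: Any, b: Any) -> Any:
    if a is None:
        return b
    if b is None:
        return a
    return max(a, b)


# Three example handles sharing one table.
SUM = AggHandle(lambda: 0, lambda s, v: s + v, lambda a, b: a + b)
COUNT = AggHandle(lambda: 0, lambda s, v: s + 1, lambda a, b: a + b)
MAX = AggHandle(lambda: None,
                lambda s, v: v if s is None or v > s else s,
                _max_merge)
```

With separate tables, a row that feeds SUM, COUNT, and MAX would cost three probes and three lock acquisitions; here `aggregate(key, (v, v, v))` costs one of each, and `merge_from` combines partial results slot by slot.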