Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
4.0.0
Description
The previous implementation of hash group-by converts input ExecBatches to a row-oriented format, then hashes and compares rows as if they were a single column.
It is more efficient (especially for a small number of key columns) to avoid the relatively costly encoding. Instead, hashes can be computed for individual columns in column-oriented format and then mixed together, and column-oriented input data can be compared directly against the row-oriented data in the hash table without converting it first.
Encoding then happens only for the subset of input rows that are inserted into the hash table, that is, the rows that introduce new groups.
Keys in the hash table remain stored in row-oriented format.
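The approach described above can be sketched roughly as follows. This is an illustration only, not Arrow's actual implementation or API: it assumes fixed-width int64 key columns without nulls, and the names `Batch`, `MixHash`, `HashColumns`, and `Grouper` are hypothetical. The mixing constant is a common hash-combine constant, and `std::unordered_multimap` stands in for the engine's own hash table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical column-oriented batch: one vector of int64 values per key column.
using Column = std::vector<int64_t>;
using Batch = std::vector<Column>;

// Mix one value into a running per-row hash (illustrative hash-combine step).
inline uint64_t MixHash(uint64_t seed, uint64_t value) {
  return seed ^ (value + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2));
}

// Compute one hash per row by walking the batch column by column,
// without first converting rows to a row-oriented encoding.
std::vector<uint64_t> HashColumns(const Batch& batch, size_t num_rows) {
  std::vector<uint64_t> hashes(num_rows, 0);
  for (const Column& col : batch) {
    assert(col.size() == num_rows);
    for (size_t row = 0; row < num_rows; ++row) {
      hashes[row] = MixHash(hashes[row], static_cast<uint64_t>(col[row]));
    }
  }
  return hashes;
}

class Grouper {
 public:
  explicit Grouper(size_t num_key_columns) : num_cols_(num_key_columns) {}

  // Returns one group id per input row; new keys get the next free id.
  std::vector<uint32_t> Consume(const Batch& batch, size_t num_rows) {
    assert(batch.size() == num_cols_);
    std::vector<uint64_t> hashes = HashColumns(batch, num_rows);
    std::vector<uint32_t> ids(num_rows);
    for (size_t row = 0; row < num_rows; ++row) {
      ids[row] = FindOrInsert(batch, row, hashes[row]);
    }
    return ids;
  }

  size_t num_groups() const { return num_groups_; }

 private:
  // Compare a column-oriented input row against a stored row-oriented key.
  bool KeyEquals(const Batch& batch, size_t row, uint32_t group) const {
    const int64_t* stored = rows_.data() + group * num_cols_;
    for (size_t c = 0; c < num_cols_; ++c) {
      if (batch[c][row] != stored[c]) return false;
    }
    return true;
  }

  uint32_t FindOrInsert(const Batch& batch, size_t row, uint64_t hash) {
    auto range = table_.equal_range(hash);
    for (auto it = range.first; it != range.second; ++it) {
      if (KeyEquals(batch, row, it->second)) return it->second;
    }
    // New group: only now is the input row encoded into row-oriented storage.
    uint32_t group = static_cast<uint32_t>(num_groups_++);
    for (size_t c = 0; c < num_cols_; ++c) rows_.push_back(batch[c][row]);
    table_.emplace(hash, group);
    return group;
  }

  size_t num_cols_;
  size_t num_groups_ = 0;
  std::vector<int64_t> rows_;                          // row-oriented keys
  std::unordered_multimap<uint64_t, uint32_t> table_;  // hash -> group id
};
```

For example, consuming a two-column batch with key rows (1, 10), (2, 20), (1, 10), (3, 30), (2, 21) through `Grouper(2)` yields group ids 0, 1, 0, 2, 3, and only the four distinct rows are ever copied into the row-oriented store.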
Issue Links
- is a child of ARROW-12633 [C++] Query engine umbrella issue (Open)
- links to