Details
- Improvement
- Status: Resolved
- Minor
- Resolution: Fixed
- Impala 2.2
- None
Description
Currently we maintain two versions of the hash-based aggregations and joins: the (old) unpartitioned ones and the partitioned, spillable ones (PAGG and PHJ). The main reason we kept the old versions was the additional memory that PAGG and PHJ consume in small-ish aggregations and joins.
But maintaining this extra code is cumbersome, error-prone, and tricky to test. For example, the new PHJ supports functionality (join modes) that the old one does not, which means that sometimes we still use PHJ even though it has been disabled; see IMPALA-1751.
If we manage to make PAGG and PHJ consume about as much memory as their unpartitioned counterparts on small-ish inputs (or only a few MBs more), then there is no reason to keep the old AGG and HJ nodes around.
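To make the IMPALA-1751 problem concrete, here is a minimal sketch of the kind of selection logic that produces it. This is not Impala's actual code: `JoinOp`, `JoinImpl`, `OldHashJoinSupports` and `ChooseJoinImpl` are hypothetical names, and the set of join modes the old node supports is an assumption made only for illustration.

```cpp
// Hypothetical sketch; none of these names come from the Impala code base.
enum class JoinOp { INNER, LEFT_OUTER, RIGHT_ANTI };
enum class JoinImpl { OLD_HASH_JOIN, PARTITIONED_HASH_JOIN };

// Assumption for illustration: the old node handles only a subset of join modes.
bool OldHashJoinSupports(JoinOp op) {
  return op == JoinOp::INNER || op == JoinOp::LEFT_OUTER;
}

// Picks the join implementation given the user's setting and the join mode.
JoinImpl ChooseJoinImpl(bool enable_partitioned_hash_join, JoinOp op) {
  if (enable_partitioned_hash_join) return JoinImpl::PARTITIONED_HASH_JOIN;
  if (OldHashJoinSupports(op)) return JoinImpl::OLD_HASH_JOIN;
  // The user disabled PHJ, but only PHJ can run this join mode,
  // so the setting is silently overridden.
  return JoinImpl::PARTITIONED_HASH_JOIN;
}
```

With a single implementation, the last branch (and the setting itself) would no longer be needed.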
Issue Links
- blocks
  - IMPALA-1751 Setting -enable_partitioned_hash_join=false still allows spilling for some join types. (Resolved)
- depends upon
  - IMPALA-3200 Replace BufferedBlockMgr with new buffer pool (Resolved)
- is related to
  - IMPALA-2243 SEGV in ScopedTimer during old agg node Open() (Resolved)
  - IMPALA-1953 Add query option to allow enable/disable PHJ/PAGG per query (Resolved)
  - IMPALA-2852 Remove old hash table (Resolved)