Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Our hash table collects useful stats about collisions and travel length, but we never expose them: https://github.com/apache/impala/blob/540611e863fe99b3d3ae35f8b94a745a68b9eba2/be/src/exec/hash-table.h#L989
We should add some of them to the profile, maybe:
- the number of probes
- the average travel length per probe
- the number of hash collisions
- (optional) the number of hash table resizes. We already have the hash table size and the resize time, which I think is sufficient to debug most problems with resizes.
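
A rough sketch of how the counters listed above could be accumulated and rendered for a profile. Everything here is hypothetical: `HashTableStats`, `RecordProbe`, `Merge` and `ToProfileString` are illustrative names, not the existing hash table or profile API, and "travel length" is assumed to mean the number of buckets stepped past during a probe.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for the per-hash-table stats; not the real Impala class.
struct HashTableStats {
  int64_t num_probes = 0;           // total probes issued
  int64_t travel_length = 0;        // buckets stepped past, summed over all probes
  int64_t num_hash_collisions = 0;  // probes that hit a bucket with a different key
  int64_t num_resizes = 0;          // optional: number of times the table was resized

  // Record a single probe that stepped past `steps` buckets before finishing.
  void RecordProbe(int64_t steps, bool hash_collision) {
    assert(steps >= 0);
    ++num_probes;
    travel_length += steps;
    if (hash_collision) ++num_hash_collisions;
  }

  // Average travel length per probe; 0 when nothing has been probed yet.
  double AvgTravelLength() const {
    if (num_probes == 0) return 0.0;
    return static_cast<double>(travel_length) / static_cast<double>(num_probes);
  }

  // Fold another table's stats into this one, e.g. one hash table per partition.
  void Merge(const HashTableStats& other) {
    num_probes += other.num_probes;
    travel_length += other.travel_length;
    num_hash_collisions += other.num_hash_collisions;
    num_resizes += other.num_resizes;
  }

  // Render the stats as a single human-readable line for a profile entry.
  std::string ToProfileString() const {
    std::ostringstream ss;
    ss << "Probes: " << num_probes
       << ", AvgTravelLength: " << AvgTravelLength()
       << ", HashCollisions: " << num_hash_collisions
       << ", Resizes: " << num_resizes;
    return ss.str();
  }
};
```

Storing the summed travel length and dividing only when reporting keeps the per-probe cost to a couple of integer increments, and it lets stats from several hash tables be merged without losing the correct average.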