Details

Type: Improvement

Status: Closed

Priority: Trivial

Resolution: Won't Fix

Affects Version/s: None

Fix Version/s: 4.0ALPHA

Component/s: None

Labels:None
Description
I recently had an interesting discussion with Sebastiano Vigna (fastutil), who suggested that linear probing, given a hash mixing function with good avalanche properties, is a way better method of constructing lookups in associative arrays compared to quadratic probing. Indeed, with linear probing you can implement removals from a hash map without removed slot markers and linear probing has nice properties with respect to modern CPUs (caches). I've reimplemented HPPC's hash maps to use linear probing and we observed a nice speedup (the same applies for fastutils of course).
This patch changes NodeHash's implementation to use linear probing. The code is a bit simpler (I think . I also moved the load factor to a constant – 0.5 seems like a generous load factor, especially if we allow large FSTs to be built. I don't see any significant speedup in constructing large automata, but there is no slowdown either (I checked on one machine only for now, but will verify on other machines too).
Linear probing in NodeHash.