Details
-
Improvement
-
Status: Closed
-
Trivial
-
Resolution: Won't Fix
-
None
-
None
-
None
Description
I recently had an interesting discussion with Sebastiano Vigna (fastutil), who suggested that linear probing, given a hash mixing function with good avalanche properties, is a way better method of constructing lookups in associative arrays compared to quadratic probing. Indeed, with linear probing you can implement removals from a hash map without removed slot markers and linear probing has nice properties with respect to modern CPUs (caches). I've reimplemented HPPC's hash maps to use linear probing and we observed a nice speedup (the same applies for fastutils of course).
This patch changes NodeHash's implementation to use linear probing. The code is a bit simpler (I think . I also moved the load factor to a constant – 0.5 seems like a generous load factor, especially if we allow large FSTs to be built. I don't see any significant speedup in constructing large automata, but there is no slowdown either (I checked on one machine only for now, but will verify on other machines too).