Lucene - Core
  1. Lucene - Core
  2. LUCENE-2967

Use linear probing with an additional good bit avalanching function in FST's NodeHash.


    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Trivial Trivial
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: None
    • Labels:


      I recently had an interesting discussion with Sebastiano Vigna (fastutil), who suggested that linear probing, given a hash mixing function with good avalanche properties, is a way better method of constructing lookups in associative arrays compared to quadratic probing. Indeed, with linear probing you can implement removals from a hash map without removed slot markers and linear probing has nice properties with respect to modern CPUs (caches). I've reimplemented HPPC's hash maps to use linear probing and we observed a nice speedup (the same applies for fastutils of course).

      This patch changes NodeHash's implementation to use linear probing. The code is a bit simpler (I think . I also moved the load factor to a constant – 0.5 seems like a generous load factor, especially if we allow large FSTs to be built. I don't see any significant speedup in constructing large automata, but there is no slowdown either (I checked on one machine only for now, but will verify on other machines too).



          • Assignee:
            Dawid Weiss
            Dawid Weiss
          • Votes:
            0 Vote for this issue
            0 Start watching this issue


            • Created: