Lucene - Core
  1. Lucene - Core
  2. LUCENE-2967

Use linear probing with an additional good bit avalanching function in FST's NodeHash.


    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Trivial Trivial
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: None
    • Labels:


      I recently had an interesting discussion with Sebastiano Vigna (fastutil), who suggested that linear probing, given a hash mixing function with good avalanche properties, is a way better method of constructing lookups in associative arrays compared to quadratic probing. Indeed, with linear probing you can implement removals from a hash map without removed slot markers and linear probing has nice properties with respect to modern CPUs (caches). I've reimplemented HPPC's hash maps to use linear probing and we observed a nice speedup (the same applies for fastutils of course).

      This patch changes NodeHash's implementation to use linear probing. The code is a bit simpler (I think . I also moved the load factor to a constant – 0.5 seems like a generous load factor, especially if we allow large FSTs to be built. I don't see any significant speedup in constructing large automata, but there is no slowdown either (I checked on one machine only for now, but will verify on other machines too).


        Uwe Schindler made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Dawid Weiss made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Lucene Fields [New]
        Resolution Won't Fix [ 2 ]
        Dawid Weiss made changes -
        Field Original Value New Value
        Attachment LUCENE-2967.patch [ 12473674 ]
        Dawid Weiss created issue -


          • Assignee:
            Dawid Weiss
            Dawid Weiss
          • Votes:
            0 Vote for this issue
            0 Start watching this issue


            • Created: