Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-2967

Use linear probing with an additional good bit avalanching function in FST's NodeHash.

Details

    • Improvement
    • Status: Closed
    • Trivial
    • Resolution: Won't Fix
    • None
    • 4.0-ALPHA
    • None
    • None

    Description

      I recently had an interesting discussion with Sebastiano Vigna (fastutil), who suggested that linear probing, given a hash mixing function with good avalanche properties, is a way better method of constructing lookups in associative arrays compared to quadratic probing. Indeed, with linear probing you can implement removals from a hash map without removed slot markers and linear probing has nice properties with respect to modern CPUs (caches). I've reimplemented HPPC's hash maps to use linear probing and we observed a nice speedup (the same applies for fastutils of course).

      This patch changes NodeHash's implementation to use linear probing. The code is a bit simpler (I think . I also moved the load factor to a constant – 0.5 seems like a generous load factor, especially if we allow large FSTs to be built. I don't see any significant speedup in constructing large automata, but there is no slowdown either (I checked on one machine only for now, but will verify on other machines too).

      Attachments

        1. LUCENE-2967.patch
          5 kB
          Dawid Weiss

        Activity

          People

            dweiss Dawid Weiss
            dweiss Dawid Weiss
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: