Lucene - Core · LUCENE-2967

Use linear probing with an additional good bit avalanching function in FST's NodeHash.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Trivial
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: None
    • Labels: None

      Description

      I recently had an interesting discussion with Sebastiano Vigna (fastutil), who suggested that linear probing, given a hash mixing function with good avalanche properties, is a way better method of constructing lookups in associative arrays compared to quadratic probing. Indeed, with linear probing you can implement removals from a hash map without removed slot markers and linear probing has nice properties with respect to modern CPUs (caches). I've reimplemented HPPC's hash maps to use linear probing and we observed a nice speedup (the same applies for fastutils of course).

      This patch changes NodeHash's implementation to use linear probing. The code is a bit simpler, I think. I also moved the load factor to a constant – 0.5 seems like a generous load factor, especially if we allow large FSTs to be built. I don't see any significant speedup in constructing large automata, but there is no slowdown either (I checked on one machine only for now, but will verify on other machines too).
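The scheme the description refers to can be sketched roughly as follows. This is a hypothetical illustration, not the actual NodeHash or HPPC code: a power-of-two open-addressed set using linear probing, the MurmurHash3 fmix32 finalizer as the avalanching mix, and the 0.5 load factor mentioned above.

```java
// Hypothetical sketch of linear probing with an avalanching bit-mix step.
// Not the actual Lucene NodeHash implementation.
public class LinearProbingSet {
    private long[] table; // 0 marks an empty slot, so keys must be non-zero
    private int mask;
    private int size;

    public LinearProbingSet(int capacity) {
        int cap = Integer.highestOneBit(Math.max(2, capacity - 1)) << 1; // next power of two
        table = new long[cap];
        mask = cap - 1;
    }

    // MurmurHash3 fmix32 finalizer: spreads entropy across all 32 bits,
    // which linear probing needs because only the low bits select a slot.
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Returns true if the key was newly inserted. */
    public boolean add(long key) {
        int slot = mix(Long.hashCode(key)) & mask;
        while (table[slot] != 0) {           // linear probing: step by 1, wrap via mask
            if (table[slot] == key) return false;
            slot = (slot + 1) & mask;
        }
        table[slot] = key;
        if (++size * 2 >= table.length) rehash(); // grow at load factor 0.5
        return true;
    }

    private void rehash() {
        long[] old = table;
        table = new long[old.length << 1];
        mask = table.length - 1;
        size = 0;
        for (long k : old) if (k != 0) add(k);
    }
}
```

Because probes walk consecutive slots, lookups touch at most a cache line or two, which is the CPU-friendliness argument made above.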

        Activity

        Dawid Weiss added a comment -

        Linear probing in NodeHash.

        Michael McCandless added a comment -

        Hmm, unfortunately, I'm seeing the patch make FST building slower, at
        least in my env/test set. I built FST for the 38M wikipedia terms.

        I ran 6 times each, alternating trunk & patch.

        I also turned off saving the FST, and ran -noverify, so I'm only
        measuring time to build it. I run java -Xmx2g -Xms2g -Xbatch, and
        measure wall clock time.

        Times on trunk (seconds):

          43.795
          43.493
          44.343
          44.045
          43.645
          43.846
        

        Times w/ patch:

          46.595
          47.751
          47.901
          47.901
          47.901
          47.700
        

        We could also try less generous load factors...
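As a back-of-envelope guide to what "less generous load factors" would buy (an aside, not from the thread): Knuth's classic approximations for linear probing under uniform hashing put the expected probes at ½(1 + 1/(1−α)) for a successful search and ½(1 + 1/(1−α)²) for an unsuccessful one, where α is the load factor.

```java
// Expected probe counts for linear probing at various load factors,
// using Knuth's approximations (assumes uniform hashing).
public class ProbeEstimate {
    static double hit(double a)  { return 0.5 * (1 + 1 / (1 - a)); }
    static double miss(double a) { return 0.5 * (1 + 1 / ((1 - a) * (1 - a))); }

    public static void main(String[] args) {
        for (double a : new double[] {0.25, 0.5, 0.75}) {
            System.out.printf("load=%.2f  hit=%.2f  miss=%.2f%n", a, hit(a), miss(a));
        }
    }
}
```

At α = 0.5 that is about 1.5 probes per hit and 2.5 per miss; dropping to α = 0.25 only shaves fractions of a probe, at the cost of doubling memory.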

        Dawid Weiss added a comment -

        Yes, now I see this difference on the 38M too:

        trunk:

        56.462
        55.725
        55.544
        55.522
        

        w/patch:

        59.9
        59.6
        

        I'll see if I can find out the problem here; I assume the collision ratio should be nearly identical... but who knows. This is of no priority, but interesting stuff. I'll close if I can't get it better than the trunk version.

        Dawid Weiss added a comment -

        I spent some time on this. It's quite fascinating: the number of collisions for the default probing is smaller than:

        a) linear probing with murmurhash mix of the original hash
        b) linear probing without murmurhash mix (start from raw hash only).

        Curiously, the number of collisions for (b) is smaller than for (a) – this could be explained if we assume bits are spread evenly throughout the entire 32-bit range after murmurhash, so after masking to table size there should be more collisions on lower bits compared to a raw hash (which would have more collisions on upper bits and fewer on lower bits because it is multiplicative... or at least I think so).

        Anyway, I tried many different versions and I don't see any significant difference in favor of linear probing here. Measured the GC overhead during my tests too, but it is not the primary factor contributing to the total cost of constructing the FST (about 3-5% of the total time, running in parallel, typically).
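The counter-intuitive result above can be reproduced with a toy experiment (an illustration with synthetic keys, not the real NodeHash hashes): when the input hash is multiplicative over dense keys, its low bits can already be a near-perfect permutation, so an avalanching mix can only add collisions after masking.

```java
// Hypothetical experiment: count slot collisions after masking to a
// power-of-two table, with and without the MurmurHash3 fmix32 avalanche step.
// Keys here are synthetic (a multiplicative hash of sequential ints).
public class CollisionCount {

    // MurmurHash3 fmix32 finalizer.
    static int fmix32(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    static int collisions(int n, int mask, boolean mix) {
        boolean[] used = new boolean[mask + 1];
        int collisions = 0;
        for (int i = 0; i < n; i++) {
            int h = i * 31;                      // multiplicative hash of sequential keys
            int slot = (mix ? fmix32(h) : h) & mask;
            if (used[slot]) collisions++; else used[slot] = true;
        }
        return collisions;
    }

    public static void main(String[] args) {
        int mask = (1 << 16) - 1;  // 65536 slots
        int n = 1 << 15;           // fill to load factor 0.5
        System.out.println("raw:   " + collisions(n, mask, false));
        System.out.println("mixed: " + collisions(n, mask, true));
    }
}
```

Since 31 is odd (hence invertible mod 2^16), the raw variant maps these keys to distinct slots, while the mixed variant behaves like random placement and collides at roughly the birthday-problem rate.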


          People

          • Assignee:
            Dawid Weiss
            Reporter:
            Dawid Weiss
          • Votes: 0
            Watchers: 0
