Lucene - Core / LUCENE-7258

Tune DocIdSetBuilder allocation rate

Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.1, 7.0
    • Component/s: modules/spatial
    • Labels: None
    • Lucene Fields: New

    Description

      LUCENE-7211 converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but didn't actually reduce garbage generation for my Solr index.

      Since something like 40% of my garbage (by space) is now attributed to DocIdSetBuilder.growBuffer, I charted a few different allocation strategies to see if I could tune things more.

      See here: http://i.imgur.com/7sXLAYv.jpg
      The jump-then-flatline at the right is where DocIdSetBuilder gives up and allocates a FixedBitSet for a 100M-doc index. (The curve and cutoff for the 1M-doc index looked similar.)

      Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is terrible from an allocation standpoint if you're doing a lot of expansions, and especially so when it's used to build a short-lived data structure like this one.
      By the time it switches to the FixedBitSet (FBS), it has allocated around twice as much memory for the buffer as it would have needed for just the FBS.
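
      To make the growth-factor point concrete, here is a rough sketch (not Lucene code) that adds up the size of every array allocated while a buffer grows to a target capacity. The class name GrowthSim, the method totalAllocated, the starting size of 32, and the 1M target are all made up for illustration. The 1/8th rule is also simplified: the real ArrayUtil.oversize additionally applies a minimum growth and alignment rounding, which this sketch ignores.

```java
// Hypothetical simulation, not part of Lucene: compares cumulative allocation
// for a buffer that grows by a fixed fraction of its capacity each time.
class GrowthSim {

    /**
     * Returns the total number of elements allocated across every array
     * created while growing from `initial` until capacity >= `target`.
     * Each step grows by (capacity >> shift): shift = 3 approximates a
     * 1/8th growth factor, shift = 0 doubles the capacity.
     */
    static long totalAllocated(int initial, int target, int shift) {
        long capacity = initial;
        long total = initial; // the first array counts too
        while (capacity < target) {
            long extra = Math.max(1, capacity >> shift); // always grow by at least 1
            capacity += extra;
            total += capacity; // a new array of the larger size is allocated
        }
        return total;
    }

    public static void main(String[] args) {
        int target = 1_000_000; // assumed target, in elements
        long eighth = totalAllocated(32, target, 3);
        long doubling = totalAllocated(32, target, 0);
        System.out.println("1/8th growth, total elements allocated: " + eighth);
        System.out.println("doubling,     total elements allocated: " + doubling);
    }
}
```

      Under these assumptions, growing by 1/8th allocates roughly nine times the final capacity in total, because the successive sizes form a geometric series with ratio 9/8, while doubling allocates about twice the final capacity. All of the intermediate arrays are garbage as soon as the next one is filled.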

      Attachments

        1. LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch
          7 kB
          Jeff Wartes
        2. LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch
          6 kB
          Jeff Wartes
        3. LUCENE-7258-expanding.patch
          8 kB
          Adrien Grand
        4. LUCENE-7258.patch
          10 kB
          Adrien Grand
        5. allocation_plot.jpg
          39 kB
          Jeff Wartes

          People

            Assignee: Unassigned
            Reporter: Jeff Wartes (jwartes)
            Votes: 0
            Watchers: 7
