  1. Lucene - Core
  2. LUCENE-2357

Reduce transient RAM usage while merging by using packed ints array for docID re-mapping

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA, 6.0
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      We allocate an int[] to remap docIDs when merging compacts away deleted docs.

      This uses a lot of RAM for large segment merges, and the allocation can fail due to fragmentation on 32-bit JREs.

      Now that we have packed ints, a simple fix would be to use a packed int array. Instead of storing the absolute docID in the mapping, we could perhaps store the number of deleted docs seen so far, so the remap would do a lookup followed by a subtraction. This may add some CPU cost to merging but should reduce transient RAM usage considerably.
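
      A minimal sketch of the idea, assuming a hand-rolled bit-packed array rather than Lucene's own packed ints classes. The class name PackedDocMapSketch and its methods are hypothetical and are not taken from the attached patches. For each docID it stores the number of deleted docs seen before it, so remapping a live doc is one lookup and one subtraction.

```java
import java.util.BitSet;

/** Hypothetical sketch; not Lucene's packed ints API and not the committed patch. */
final class PackedDocMapSketch {
    private final long[] blocks;     // values packed back to back, bitsPerValue bits each
    private final int bitsPerValue;
    private final long mask;
    private final BitSet deleted;    // which old docIDs are deleted

    private PackedDocMapSketch(long[] blocks, int bitsPerValue, BitSet deleted) {
        this.blocks = blocks;
        this.bitsPerValue = bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
        this.deleted = deleted;
    }

    /** Builds the map for docIDs in [0, maxDoc). */
    static PackedDocMapSketch build(BitSet deleted, int maxDoc) {
        // First pass: the largest stored value is at most the total delete count.
        int totalDels = 0;
        for (int i = 0; i < maxDoc; i++) {
            if (deleted.get(i)) totalDels++;
        }
        int bpv = Math.max(1, 64 - Long.numberOfLeadingZeros(totalDels));
        long[] blocks = new long[(int) (((long) maxDoc * bpv + 63) >>> 6)];
        PackedDocMapSketch map = new PackedDocMapSketch(blocks, bpv, deleted);

        // Second pass: store the number of deleted docs seen before each docID.
        int delsSoFar = 0;
        for (int i = 0; i < maxDoc; i++) {
            map.set(i, delsSoFar);
            if (deleted.get(i)) delsSoFar++;
        }
        return map;
    }

    /** Writes a value once; relies on the backing array starting out zeroed. */
    private void set(int index, long value) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] |= value << shift;
        if (shift + bitsPerValue > 64) {
            // The value straddles two longs; the high bits go into the next block.
            blocks[block + 1] |= value >>> (64 - shift);
        }
    }

    private long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long v = blocks[block] >>> shift;
        if (shift + bitsPerValue > 64) {
            v |= blocks[block + 1] << (64 - shift);
        }
        return v & mask;
    }

    /** Returns the new docID, or -1 if the old doc is deleted. */
    int remap(int oldDocID) {
        if (deleted.get(oldDocID)) return -1;
        return oldDocID - (int) get(oldDocID);   // lookup, then subtract
    }

    int bitsPerValue() {
        return bitsPerValue;
    }
}
```

      In this sketch the width of each entry depends on the total number of deletions rather than on the segment size, so a segment with few deletions needs only a few bits per doc instead of the 32 bits an int[] entry takes. The extra shifts and masks on every lookup are the CPU cost mentioned above.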

        Attachments

        1. LUCENE-2357.patch
          17 kB
          Adrien Grand
        2. LUCENE-2357.patch
          17 kB
          Michael McCandless
        3. LUCENE-2357.patch
          16 kB
          Adrien Grand
        4. LUCENE-2357.patch
          16 kB
          Michael McCandless
        5. LUCENE-2357.patch
          15 kB
          Adrien Grand

              People

              • Assignee:
                Adrien Grand (jpountz)
              • Reporter:
                Michael McCandless (mikemccand)
              • Votes:
                2
              • Watchers:
                5
