Lucene - Core
  1. Lucene - Core
  2. LUCENE-2357

Reduce transient RAM usage while merging by using packed ints array for docID re-mapping

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA, 5.0
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      We allocate this int[] to remap docIDs due to compaction of deleted ones.

      This uses alot of RAM for large segment merges, and can fail to allocate due to fragmentation on 32 bit JREs.

      Now that we have packed ints, a simple fix would be to use a packed int array... and maybe instead of storing abs docID in the mapping, we could store the number of del docs seen so far (so the remap would do a lookup then a subtract). This may add some CPU cost to merging but should bring down transient RAM usage quite a bit.

      1. LUCENE-2357.patch
        15 kB
        Adrien Grand
      2. LUCENE-2357.patch
        16 kB
        Michael McCandless
      3. LUCENE-2357.patch
        16 kB
        Adrien Grand
      4. LUCENE-2357.patch
        17 kB
        Michael McCandless
      5. LUCENE-2357.patch
        17 kB
        Adrien Grand

        Issue Links

          Activity

            People

            • Assignee:
              Adrien Grand
              Reporter:
              Michael McCandless
            • Votes:
              2 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development