Lucene - Core / LUCENE-2357

Reduce transient RAM usage while merging by using packed ints array for docID re-mapping

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA, 6.0
    • Component/s: core/index
    • Labels: None
    • Lucene Fields: New

    Description

      During a merge, we allocate an int[] to remap docIDs, because deleted docs are compacted away.

      This uses a lot of RAM for large segment merges, and the allocation can fail due to fragmentation on 32-bit JREs.

      Now that we have packed ints, a simple fix would be to use a packed int array. Instead of storing the absolute docID in the mapping, we could perhaps store the number of deleted docs seen so far, so that the remap does a lookup and then a subtraction. This may add some CPU cost to merging but should bring down transient RAM usage quite a bit.
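
The idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patches: the class `DocMapSketch`, its methods, and the hand-rolled bit packing over a `long[]` are all hypothetical stand-ins for Lucene's packed ints, and deletions are assumed to be available as a `java.util.BitSet`. Each slot stores the number of deleted docs seen up to and including that docID, so a slot needs only enough bits to hold the total delete count rather than a full 32-bit int.

```java
import java.util.BitSet;

/** Hypothetical sketch of a delete-count based docID remapper (not Lucene's API). */
final class DocMapSketch {
    private final long[] blocks;     // packed storage, bitsPerValue bits per doc
    private final int bitsPerValue;  // enough bits to hold the total delete count
    private final long mask;
    private final BitSet deleted;    // assumption: set bit = deleted doc

    DocMapSketch(BitSet deleted, int maxDoc) {
        this.deleted = deleted;
        // First pass: count deletions below maxDoc to size each packed slot.
        int numDeleted = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (deleted.get(doc)) numDeleted++;
        }
        bitsPerValue = Math.max(1, 64 - Long.numberOfLeadingZeros(numDeleted));
        mask = (1L << bitsPerValue) - 1;
        long totalBits = (long) maxDoc * bitsPerValue;
        blocks = new long[(int) ((totalBits + 63) >>> 6)];
        // Second pass: store the running delete count for every docID.
        int seen = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (deleted.get(doc)) seen++;
            set(doc, seen);
        }
    }

    /** Returns the compacted docID, or -1 if the doc is deleted. */
    int remap(int doc) {
        if (deleted.get(doc)) return -1;
        return doc - (int) get(doc); // lookup, then subtract
    }

    int bitsPerValue() {
        return bitsPerValue;
    }

    /** Writes a value into a slot that is still zero; it may straddle two longs. */
    private void set(int index, long value) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] |= value << shift;
        int spill = shift + bitsPerValue - 64; // bits that fall into the next long
        if (spill > 0) {
            blocks[block + 1] |= value >>> (bitsPerValue - spill);
        }
    }

    private long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        int spill = shift + bitsPerValue - 64;
        if (spill > 0) {
            value |= blocks[block + 1] << (bitsPerValue - spill);
        }
        return value & mask;
    }

    public static void main(String[] args) {
        BitSet del = new BitSet();
        del.set(1);
        del.set(2);
        del.set(5);
        DocMapSketch map = new DocMapSketch(del, 8);
        for (int doc = 0; doc < 8; doc++) {
            System.out.println(doc + " -> " + map.remap(doc));
        }
    }
}
```

With 8 docs and deletions at 1, 2 and 5, the live docs 0, 3, 4, 6, 7 map to 0, 1, 2, 3, 4, and each slot takes 2 bits instead of 32. The trade-off is the one the description anticipates: every remap pays for a few shifts and a subtraction, in exchange for a mapping whose size depends on the delete count rather than on the docID range.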

      Attachments

        1. LUCENE-2357.patch (15 kB, Adrien Grand)
        2. LUCENE-2357.patch (16 kB, Michael McCandless)
        3. LUCENE-2357.patch (16 kB, Adrien Grand)
        4. LUCENE-2357.patch (17 kB, Michael McCandless)
        5. LUCENE-2357.patch (17 kB, Adrien Grand)

          People

            Assignee: Adrien Grand (jpountz)
            Reporter: Michael McCandless (mikemccand)
            Votes: 2
            Watchers: 5
