Lucene - Core / LUCENE-2680

Improve how IndexWriter flushes deletes against existing segments

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      IndexWriter buffers up all deletes (by Term and Query) and only
      applies them if 1) commit or NRT getReader() is called, or 2) a merge
      is about to kick off.

      We do this because, for a large index, it's very costly to open a
      SegmentReader for every segment in the index. So we defer as long as
      we can. We do it just before merge so that the merge can eliminate
      the deleted docs.

      But most merges are small, yet in a big index we still apply deletes
      to every segment, which is very wasteful.

      Instead, we should only apply the buffered deletes to the segments
      that are about to be merged, and keep the buffer around for the
      remaining segments.

      I think it's not so hard to do; we'd have to have generations of
      pending deletions, because the newly merged segment doesn't need the
      same buffered deletions applied again. So every time a merge kicks
      off, we pinch off the current set of buffered deletions, open a new
      set (the next generation), and record which segment was created as of
      which generation.
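
      The generation bookkeeping described above could look roughly like
      the toy model below. It is only a sketch and not the attached patch:
      GenerationalDeletesSketch, Segment, pinchOff, nextGen and
      segmentsOpened are made-up names, a "document" is just a term string,
      and the model also pinches off a generation on every flush (an
      assumption made here so that earlier deletes never hit newer docs).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

/** Toy model only: none of these names are Lucene APIs. */
final class GenerationalDeletesSketch {

    /** A toy segment; each document is represented by a single term string. */
    static final class Segment {
        final Set<String> liveDocs = new HashSet<>();
        long nextGen; // lowest delete generation not yet applied to this segment

        Segment(Set<String> docs, long nextGen) {
            liveDocs.addAll(docs);
            this.nextGen = nextGen;
        }
    }

    final List<Segment> segments = new ArrayList<>();
    // generation -> terms deleted while that generation was open
    final TreeMap<Long, Set<String>> packets = new TreeMap<>();
    long openGen = 0;
    int segmentsOpened = 0; // how often we paid the cost of "opening" a segment

    GenerationalDeletesSketch() {
        packets.put(openGen, new HashSet<>());
    }

    static Set<String> docs(String... ids) {
        return new HashSet<>(Arrays.asList(ids));
    }

    /** Buffer a delete-by-term in the open generation; nothing is applied yet. */
    void deleteDocuments(String term) {
        packets.get(openGen).add(term);
    }

    /** Close the open generation, start the next one, return the closed one. */
    private long pinchOff() {
        long closed = openGen++;
        packets.put(openGen, new HashSet<>());
        return closed;
    }

    /** Flush new docs as a segment; deletes buffered earlier must not hit them. */
    Segment flush(Set<String> newDocs) {
        pinchOff();
        Segment s = new Segment(newDocs, openGen);
        segments.add(s);
        return s;
    }

    /** Apply to one segment every generation in [seg.nextGen, throughGen]. */
    private void applyDeletes(Segment seg, long throughGen) {
        segmentsOpened++;
        for (Set<String> terms : packets.subMap(seg.nextGen, true, throughGen, true).values()) {
            seg.liveDocs.removeAll(terms);
        }
        seg.nextGen = throughGen + 1;
    }

    /** Merge: resolve deletes only for the segments being merged. */
    Segment merge(List<Segment> toMerge) {
        long closed = pinchOff();
        Set<String> mergedDocs = new HashSet<>();
        for (Segment seg : toMerge) {
            applyDeletes(seg, closed);
            mergedDocs.addAll(seg.liveDocs);
        }
        segments.removeAll(toMerge);
        // The merged segment starts at the new generation, so the deletes
        // just applied to its inputs are never applied to it again.
        Segment merged = new Segment(mergedDocs, openGen);
        segments.add(merged);
        return merged;
    }

    /** Commit / NRT reader: every segment must see every buffered delete. */
    void applyAllDeletes() {
        long closed = pinchOff();
        for (Segment seg : segments) {
            applyDeletes(seg, closed);
        }
        packets.headMap(openGen, false).clear(); // older generations are now unused
    }

    public static void main(String[] args) {
        GenerationalDeletesSketch w = new GenerationalDeletesSketch();
        List<Segment> segs = new ArrayList<>();
        for (int i = 0; i < 10; i++) segs.add(w.flush(docs("a" + i, "b" + i)));
        w.deleteDocuments("a0");
        w.deleteDocuments("a9");
        Segment merged = w.merge(segs.subList(0, 2));
        System.out.println("opened for merge: " + w.segmentsOpened);
        System.out.println("merged live docs: " + merged.liveDocs.size());
        System.out.println("a9 still buffered: " + segs.get(9).liveDocs.contains("a9"));
    }
}
```

      In this model, merging 2 of 10 segments opens only those 2, and the
      delete against the untouched tenth segment stays buffered until
      applyAllDeletes() (standing in for commit or NRT getReader()) runs.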

      This should be a very sizable gain for large indices that mix in
      deletes, though less so in flex, since opening the terms index is
      much faster there.

      Attachments

      1. LUCENE-2680.patch
        33 kB
        Jason Rutherglen
      2. LUCENE-2680.patch
        30 kB
        Jason Rutherglen
      3. LUCENE-2680.patch
        37 kB
        Jason Rutherglen
      4. LUCENE-2680.patch
        45 kB
        Jason Rutherglen
      5. LUCENE-2680.patch
        42 kB
        Jason Rutherglen
      6. LUCENE-2680.patch
        41 kB
        Jason Rutherglen
      7. LUCENE-2680.patch
        43 kB
        Jason Rutherglen
      8. LUCENE-2680.patch
        44 kB
        Jason Rutherglen
      9. LUCENE-2680.patch
        42 kB
        Jason Rutherglen
      10. LUCENE-2680.patch
        9 kB
        Jason Rutherglen
      11. LUCENE-2680.patch
        19 kB
        Jason Rutherglen
      12. LUCENE-2680.patch
        19 kB
        Jason Rutherglen
      13. LUCENE-2680.patch
        56 kB
        Jason Rutherglen
      14. LUCENE-2680.patch
        57 kB
        Jason Rutherglen
      15. LUCENE-2680.patch
        124 kB
        Michael McCandless
      16. LUCENE-2680.patch
        124 kB
        Michael McCandless
      17. LUCENE-2680.patch
        141 kB
        Michael McCandless

        Issue Links

          Activity

          Gavin made changes -
          Link This issue is depended upon by LUCENE-2655 [ LUCENE-2655 ]
          Gavin made changes -
          Link This issue blocks LUCENE-2655 [ LUCENE-2655 ]
          Grant Ingersoll made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Mark Thomas made changes -
          Workflow Default workflow, editable Closed status [ 12564322 ] jira [ 12584852 ]
          Mark Thomas made changes -
          Workflow jira [ 12522075 ] Default workflow, editable Closed status [ 12564322 ]
          Michael McCandless made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Michael McCandless made changes -
          Attachment LUCENE-2680.patch [ 12465908 ]
          Michael McCandless made changes -
          Attachment LUCENE-2680.patch [ 12465894 ]
          Michael McCandless made changes -
          Fix Version/s 3.1 [ 12314822 ]
          Michael McCandless made changes -
          Assignee Michael McCandless [ mikemccand ]
          Michael McCandless made changes -
          Attachment LUCENE-2680.patch [ 12465826 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12464980 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12464979 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12464787 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12464786 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12460150 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12459087 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12459030 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12459028 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12459025 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12459011 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12458859 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12458770 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12458700 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-2680.patch [ 12458565 ]
          Jason Rutherglen made changes -
          Field Original Value New Value
          Link This issue blocks LUCENE-2655 [ LUCENE-2655 ]
          Michael McCandless created issue -

            People

            • Assignee:
              Michael McCandless
              Reporter:
              Michael McCandless
            • Votes:
              0
              Watchers:
              2