Lucene - Core
  1. Lucene - Core
  2. LUCENE-2805

SegmentInfos shouldn't blindly increment version on commit

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      SegmentInfos currently increments version on the assumption that there are always changes.

      But, both DirReader and IW are more careful about tracking whether there are changes. DirReader has hasChanges and IW has changeCount. I think these classes should notify the SIS when there are in fact changes; this will fix the case Simon hit on fixing LUCENE-2082 when the NRT reader thought there were changes, but in fact there weren't because IW simply committed the exact SIS it already had.

      1. LUCENE-2805.patch
        3 kB
        Michael McCandless
      2. LUCENE-2805.patch
        6 kB
        Simon Willnauer
      3. LUCENE-2805.patch
        11 kB
        Simon Willnauer
      4. LUCENE-2805_3x.patch
        12 kB
        Simon Willnauer

        Issue Links

          Activity

          Michael McCandless created issue -
          Hide
          Michael McCandless added a comment -

          Duh, make that LUCENE-2802.

          Show
          Michael McCandless added a comment - Duh, make that LUCENE-2802 .
          Hide
          Michael McCandless added a comment -

          Attached first cut patch, just moving the .version++ responsibility into DirReader/IW.

          But I haven't verified if it fixes the case in LUCENE-2802.

          Show
          Michael McCandless added a comment - Attached first cut patch, just moving the .version++ responsibility into DirReader/IW. But I haven't verified if it fixes the case in LUCENE-2802 .
          Michael McCandless made changes -
          Field Original Value New Value
          Attachment LUCENE-2805.patch [ 12465643 ]
          Simon Willnauer made changes -
          Link This issue blocks LUCENE-2802 [ LUCENE-2802 ]
          Hide
          Simon Willnauer added a comment -

          this should be fixed together with LUCENE-2802

          Show
          Simon Willnauer added a comment - this should be fixed together with LUCENE-2802
          Hide
          Simon Willnauer added a comment -

          here is a slightly updated patch that removes the blind increment from DefaultSegmentInfosWriter, adds #changed() calles to contrib classes and adds a missing #changed() call to IW#deleteAll()

          test pass also for LUCENE-2802

          Show
          Simon Willnauer added a comment - here is a slightly updated patch that removes the blind increment from DefaultSegmentInfosWriter, adds #changed() calles to contrib classes and adds a missing #changed() call to IW#deleteAll() test pass also for LUCENE-2802
          Simon Willnauer made changes -
          Attachment LUCENE-2805.patch [ 12465658 ]
          Hide
          Michael McCandless added a comment -

          Nice catches on all the other places Simon! Patch looks good.

          Except I think we can now simplify IW.nrtIsCurrent to this?

            synchronized boolean nrtIsCurrent(SegmentInfos infos) {
              return segmentInfos.version == infos.version && !docWriter.anyChanges();
            }
          
          Show
          Michael McCandless added a comment - Nice catches on all the other places Simon! Patch looks good. Except I think we can now simplify IW.nrtIsCurrent to this? synchronized boolean nrtIsCurrent(SegmentInfos infos) { return segmentInfos.version == infos.version && !docWriter.anyChanges(); }
          Hide
          Simon Willnauer added a comment -

          Except I think we can now simplify IW.nrtIsCurrent to this?

          Yay, this make the nrtIsCurrent check simpler too. Lemme see if anything breaks, I think that could cause a test to fail since we compared the seg infos though.

          Show
          Simon Willnauer added a comment - Except I think we can now simplify IW.nrtIsCurrent to this? Yay, this make the nrtIsCurrent check simpler too. Lemme see if anything breaks, I think that could cause a test to fail since we compared the seg infos though.
          Hide
          Simon Willnauer added a comment -

          next iteration - I simplified nrtIsCurrent which is nice and all tests pass. Yet, I couldn't get any test failing with this version

          synchronized boolean nrtIsCurrent(SegmentInfos infos) {
              return segmentInfos.version == infos.version;
            }
          

          so I added another test that crashes immediately if we don't check the docWriter for changes. I also added a changes entry to runtime behavior section since the version number are different to previos versions.

          I also had to update an assert in TestIndexWriterReader#testAddIndexes since we now don't increment the version on a commit if there are no changes but generation might have been changed though.

          I think we are good to go.

          Show
          Simon Willnauer added a comment - next iteration - I simplified nrtIsCurrent which is nice and all tests pass. Yet, I couldn't get any test failing with this version synchronized boolean nrtIsCurrent(SegmentInfos infos) { return segmentInfos.version == infos.version; } so I added another test that crashes immediately if we don't check the docWriter for changes. I also added a changes entry to runtime behavior section since the version number are different to previos versions. I also had to update an assert in TestIndexWriterReader#testAddIndexes since we now don't increment the version on a commit if there are no changes but generation might have been changed though. I think we are good to go.
          Simon Willnauer made changes -
          Attachment LUCENE-2805.patch [ 12465690 ]
          Simon Willnauer made changes -
          Assignee Simon Willnauer [ simonw ]
          Hide
          Simon Willnauer added a comment -

          Committed revision 1043148.

          I will backport later today

          Show
          Simon Willnauer added a comment - Committed revision 1043148. I will backport later today
          Hide
          Simon Willnauer added a comment -

          here is a patch against 3.x. I needed to change an assert in the backwards tests so that is potentially a bw break and I am not 100% sure how we handle this right now. Is changing the bw test ok in this case? If its really a bw break I guess we should also list it in the BW section in CHANGES.txt

          Show
          Simon Willnauer added a comment - here is a patch against 3.x. I needed to change an assert in the backwards tests so that is potentially a bw break and I am not 100% sure how we handle this right now. Is changing the bw test ok in this case? If its really a bw break I guess we should also list it in the BW section in CHANGES.txt
          Simon Willnauer made changes -
          Attachment LUCENE-2805_3x.patch [ 12465772 ]
          Simon Willnauer committed 1043289 (22 files)
          Reviews: none

          Added DocValuesIndexing#addIndexes() and merged LUCENE-2802 & LUCENE-2805

          Lucene docvalues
          Hide
          Michael McCandless added a comment -

          needed to change an assert in the backwards tests so that is potentially a bw break and I am not 100% sure how we handle this right now. Is changing the bw test ok in this case? If its really a bw break I guess we should also list it in the BW section in CHANGES.txt

          It's fine to change the test – and the CHANGES entry already explains this? (Though maybe fix the CHANGES to explain that this now means some readers will in fact return "true" from isCurrent when then before incorrectly returned "false"?).

          Show
          Michael McCandless added a comment - needed to change an assert in the backwards tests so that is potentially a bw break and I am not 100% sure how we handle this right now. Is changing the bw test ok in this case? If its really a bw break I guess we should also list it in the BW section in CHANGES.txt It's fine to change the test – and the CHANGES entry already explains this? (Though maybe fix the CHANGES to explain that this now means some readers will in fact return "true" from isCurrent when then before incorrectly returned "false"?).
          Hide
          Simon Willnauer added a comment -

          Backported in revision 1043406.

          Show
          Simon Willnauer added a comment - Backported in revision 1043406.
          Simon Willnauer made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Mark Thomas made changes -
          Workflow jira [ 12539473 ] Default workflow, editable Closed status [ 12564087 ]
          Mark Thomas made changes -
          Workflow Default workflow, editable Closed status [ 12564087 ] jira [ 12585547 ]
          Hide
          Grant Ingersoll added a comment -

          Bulk close for 3.1

          Show
          Grant Ingersoll added a comment - Bulk close for 3.1
          Grant Ingersoll made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee:
              Simon Willnauer
              Reporter:
              Michael McCandless
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development