Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-5931

DirectoryReader.openIfChanged(oldReader, commit) incorrectly assumes given commit point has deletes/field updates

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 4.6.1
    • Fix Version/s: 7.0, 6.2
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      StandardDirectoryReader assumes that the segments from commit point have deletes, when they may not, yet the original SegmentReader for the segment that we are trying to reuse does. This is evident when running attached JUnit test case with asserts enabled (default):

      java.lang.AssertionError
      	at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:188)
      	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:326)
      	at org.apache.lucene.index.StandardDirectoryReader$2.doBody(StandardDirectoryReader.java:320)
      	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:702)
      	at org.apache.lucene.index.StandardDirectoryReader.doOpenFromCommit(StandardDirectoryReader.java:315)
      	at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:311)
      	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:262)
      	at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:183)
      

      or, if asserts are disabled then it falls through into NPE:

      java.lang.NullPointerException
      	at java.io.File.<init>(File.java:305)
      	at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:80)
      	at org.apache.lucene.codecs.lucene40.BitVector.<init>(BitVector.java:327)
      	at org.apache.lucene.codecs.lucene40.Lucene40LiveDocsFormat.readLiveDocs(Lucene40LiveDocsFormat.java:90)
      	at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:131)
      	at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:194)
      	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:326)
      	at org.apache.lucene.index.StandardDirectoryReader$2.doBody(StandardDirectoryReader.java:320)
      	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:702)
      	at org.apache.lucene.index.StandardDirectoryReader.doOpenFromCommit(StandardDirectoryReader.java:315)
      	at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:311)
      	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:262)
      	at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:183)
      
      1. CommitReuseTest.java
        3 kB
        Vitaly Funstein
      2. LUCENE-5931.patch
        17 kB
        Michael McCandless
      3. LUCENE-5931.patch
        18 kB
        Michael McCandless
      4. LUCENE-5931.patch
        10 kB
        Michael McCandless
      5. LUCENE-5931.patch
        9 kB
        Robert Muir
      6. LUCENE-5931.patch
        2 kB
        Michael McCandless

        Activity

        Hide
        mikemccand Michael McCandless added a comment -

        Patch, starting from Vitaly's test case (thank you!) and folding into Lucene's tests ... it fails with this on trunk:

        1) testReopenReaderToOlderCommit(org.apache.lucene.index.TestDirectoryReaderReopen)
        java.lang.IllegalStateException: same segment _0 has invalid changes; likely you are re-opening a reader after illegally removing index files yourself and building a new index in their place.  Use IndexWriter.deleteAll or OpenMode.CREATE instead
        	at __randomizedtesting.SeedInfo.seed([D3F22B13D5839643:931C8A9673D003F4]:0)
        	at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:190)
        	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:323)
        	at org.apache.lucene.index.StandardDirectoryReader$2.doBody(StandardDirectoryReader.java:317)
        	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:649)
        	at org.apache.lucene.index.StandardDirectoryReader.doOpenFromCommit(StandardDirectoryReader.java:312)
        	at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:308)
        	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:259)
        	at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:137)
        	at org.apache.lucene.index.TestDirectoryReaderReopen.testReopenReaderToOlderCommit(TestDirectoryReaderReopen.java:824)
        
        Show
        mikemccand Michael McCandless added a comment - Patch, starting from Vitaly's test case (thank you!) and folding into Lucene's tests ... it fails with this on trunk: 1) testReopenReaderToOlderCommit(org.apache.lucene.index.TestDirectoryReaderReopen) java.lang.IllegalStateException: same segment _0 has invalid changes; likely you are re-opening a reader after illegally removing index files yourself and building a new index in their place. Use IndexWriter.deleteAll or OpenMode.CREATE instead at __randomizedtesting.SeedInfo.seed([D3F22B13D5839643:931C8A9673D003F4]:0) at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:190) at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:323) at org.apache.lucene.index.StandardDirectoryReader$2.doBody(StandardDirectoryReader.java:317) at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:649) at org.apache.lucene.index.StandardDirectoryReader.doOpenFromCommit(StandardDirectoryReader.java:312) at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:308) at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:259) at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:137) at org.apache.lucene.index.TestDirectoryReaderReopen.testReopenReaderToOlderCommit(TestDirectoryReaderReopen.java:824)
        Hide
        rcmuir Robert Muir added a comment -

        Patch: I think the use-case is cool and it should be supported: its just adding an 'if' and removing the current exception (which is geared at protecting some user who manually rm -rf's files from their index.)

        I improved the test a bit to ensure that cores are shared and also tested the dv updates case.

        Show
        rcmuir Robert Muir added a comment - Patch: I think the use-case is cool and it should be supported: its just adding an 'if' and removing the current exception (which is geared at protecting some user who manually rm -rf's files from their index.) I improved the test a bit to ensure that cores are shared and also tested the dv updates case.
        Hide
        mikemccand Michael McCandless added a comment -

        Thanks Rob, patch looks great, except: I think we can keep the illegalDocCountChange safety? I think I could make a test case to trip that ...

        Show
        mikemccand Michael McCandless added a comment - Thanks Rob, patch looks great, except: I think we can keep the illegalDocCountChange safety? I think I could make a test case to trip that ...
        Hide
        rcmuir Robert Muir added a comment -

        Can you make a test change to trip it without manually removing files from your index?

        Show
        rcmuir Robert Muir added a comment - Can you make a test change to trip it without manually removing files from your index?
        Hide
        mikemccand Michael McCandless added a comment -

        Can you make a test change to trip it without manually removing files from your index?

        I don' t think so ... the only way I know of this happening is if an app has a reader open, then removes the index, rebuilds it, then tries to openIfChanged the reader.

        We can remove the defensive safety if you really want to, but we put it in last time when this happened and it sure looked like index corruption... it's nice not to have false scares even if the app is doing something it shouldn't...

        Show
        mikemccand Michael McCandless added a comment - Can you make a test change to trip it without manually removing files from your index? I don' t think so ... the only way I know of this happening is if an app has a reader open, then removes the index, rebuilds it, then tries to openIfChanged the reader. We can remove the defensive safety if you really want to, but we put it in last time when this happened and it sure looked like index corruption... it's nice not to have false scares even if the app is doing something it shouldn't...
        Hide
        rcmuir Robert Muir added a comment -

        Yes, I really want to. We can't let such abuse prevent real features, thats just wrong.

        If you want safety against deleting files, maybe look at NIO.2 WatchService. This is general and would give you notification when such things happen rather than having a hack for one particular user's mistake.

        Show
        rcmuir Robert Muir added a comment - Yes, I really want to. We can't let such abuse prevent real features, thats just wrong. If you want safety against deleting files, maybe look at NIO.2 WatchService. This is general and would give you notification when such things happen rather than having a hack for one particular user's mistake.
        Hide
        rcmuir Robert Muir added a comment -

        We can remove the defensive safety if you really want to, but we put it in last time when this happened and it sure looked like index corruption

        It is index corruption though. The user went and manually corrupted their index. Why hide that Mike?

        Show
        rcmuir Robert Muir added a comment - We can remove the defensive safety if you really want to, but we put it in last time when this happened and it sure looked like index corruption It is index corruption though. The user went and manually corrupted their index. Why hide that Mike?
        Hide
        mikemccand Michael McCandless added a comment -

        I think we can do both (keep our best effort check, and allow reopening to "older" commit point) ... here's a patch.

        Show
        mikemccand Michael McCandless added a comment - I think we can do both (keep our best effort check, and allow reopening to "older" commit point) ... here's a patch.
        Hide
        vfunstein Vitaly Funstein added a comment - - edited

        Mike/Robert,

        I have a follow-up question. I have backported the fix to 4.6 and now I believe I am seeing another serious issue here.

        If the old reader passed in to DirectoryReader.openIfChanged(DirectoryReader, IndexCommit) is actually an NRT reader, then it seems that if there is unflushed/uncommitted data in the associated writer's buffers, in particular deletes, the returned reader will see those changes - thus violating the intent of opening the index at just the commit point we wanted, frozen in time. Here's my original test case modified to show the problem:

        import static org.junit.Assert.assertEquals;
        import static org.junit.Assert.assertTrue;
        
        import org.apache.lucene.document.Document;
        import org.apache.lucene.document.Field.Store;
        import org.apache.lucene.document.StringField;
        import org.apache.lucene.index.DirectoryReader;
        import org.apache.lucene.index.IndexCommit;
        import org.apache.lucene.index.IndexWriter;
        import org.apache.lucene.index.IndexWriterConfig;
        import org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy;
        import org.apache.lucene.index.ReaderManager;
        import org.apache.lucene.index.SnapshotDeletionPolicy;
        import org.apache.lucene.index.Term;
        import org.apache.lucene.search.IndexSearcher;
        import org.apache.lucene.search.Query;
        import org.apache.lucene.search.TermQuery;
        import org.apache.lucene.search.TopDocs;
        import org.apache.lucene.store.Directory;
        import org.apache.lucene.store.FSDirectory;
        import org.apache.lucene.util.Version;
        import org.junit.After;
        import org.junit.Before;
        import org.junit.Test;
        
        import java.io.File;
        
        public class CommitReuseTest {
        
          private final File path = new File("indexDir");
          private IndexWriter writer;
          private final SnapshotDeletionPolicy snapshotter = new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
         
          @Before
          public void initIndex() throws Exception {
            path.mkdirs();
            IndexWriterConfig idxWriterCfg = new IndexWriterConfig(Version.LUCENE_46, null);
            idxWriterCfg.setIndexDeletionPolicy(snapshotter);
            idxWriterCfg.setInfoStream(System.out);
           
            Directory dir = FSDirectory.open(path);
            writer = new IndexWriter(dir, idxWriterCfg);
           
            writer.commit(); // make sure all index metadata is written out
          }
         
          @After
          public void stop() throws Exception {
            writer.close();
          }
        
          @Test
          public void test() throws Exception {
            Document doc;
            ReaderManager rm = new ReaderManager(writer, true);
           
            // Index some data
            for (int i = 0; i < 100; i++) {
              doc = new Document();
              doc.add(new StringField("key-" + i, "ABC", Store.YES));
              writer.addDocument(doc);
            }
           
            writer.commit();
           
            IndexCommit ic1 = snapshotter.snapshot();
           
            doc = new Document();
            doc.add(new StringField("key-" + 0, "AAA", Store.YES));
            writer.updateDocument(new Term("key-" + 0, "ABC"), doc);
        
            rm.maybeRefreshBlocking();
            DirectoryReader latest = rm.acquire();
            assertTrue(latest.hasDeletions());
           
            // This reader will be used for searching against commit point 1
            DirectoryReader searchReader = DirectoryReader.openIfChanged(latest, ic1);
        //    assertFalse(searchReader.hasDeletions()); // XXX - this fails too!
            rm.release(latest);
           
            IndexSearcher s = new IndexSearcher(searchReader);
            Query q = new TermQuery(new Term("key-0", "ABC"));
            TopDocs td = s.search(q, 10);
            assertEquals(1, td.totalHits);
               
            searchReader.close();
            rm.close();
            snapshotter.release(ic1);
          }
        }
        

        Note, that if I comment out the updateDocument() call, the test passes. Also, if you only have one entry in the index, then it appears that while refreshing the NRT reader, the segment containing just the single delete will be removed, making it look like the test passes:

        IW 0 [Wed Sep 17 22:32:47 PDT 2014; main]: drop 100% deleted segments: _4(4.6):c1/1
        

        This output does not appear when running the code above, unchanged. Hope this helps... I can't make further headway myself though.

        Show
        vfunstein Vitaly Funstein added a comment - - edited Mike/Robert, I have a follow-up question. I have backported the fix to 4.6 and now I believe I am seeing another serious issue here. If the old reader passed in to DirectoryReader.openIfChanged(DirectoryReader, IndexCommit) is actually an NRT reader, then it seems that if there is unflushed/uncommitted data in the associated writer's buffers, in particular deletes, the returned reader will see those changes - thus violating the intent of opening the index at just the commit point we wanted, frozen in time. Here's my original test case modified to show the problem: import static org.junit.Assert.assertEquals; import static org.junit.Assert.assertTrue; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field.Store; import org.apache.lucene.document.StringField; import org.apache.lucene.index.DirectoryReader; import org.apache.lucene.index.IndexCommit; import org.apache.lucene.index.IndexWriter; import org.apache.lucene.index.IndexWriterConfig; import org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy; import org.apache.lucene.index.ReaderManager; import org.apache.lucene.index.SnapshotDeletionPolicy; import org.apache.lucene.index.Term; import org.apache.lucene.search.IndexSearcher; import org.apache.lucene.search.Query; import org.apache.lucene.search.TermQuery; import org.apache.lucene.search.TopDocs; import org.apache.lucene.store.Directory; import org.apache.lucene.store.FSDirectory; import org.apache.lucene.util.Version; import org.junit.After; import org.junit.Before; import org.junit.Test; import java.io.File; public class CommitReuseTest { private final File path = new File( "indexDir" ); private IndexWriter writer; private final SnapshotDeletionPolicy snapshotter = new SnapshotDeletionPolicy( new KeepOnlyLastCommitDeletionPolicy()); @Before public void initIndex() throws Exception { path.mkdirs(); IndexWriterConfig idxWriterCfg = new IndexWriterConfig(Version.LUCENE_46, null ); idxWriterCfg.setIndexDeletionPolicy(snapshotter); idxWriterCfg.setInfoStream( System .out); Directory dir = FSDirectory.open(path); writer = new IndexWriter(dir, idxWriterCfg); writer.commit(); // make sure all index metadata is written out } @After public void stop() throws Exception { writer.close(); } @Test public void test() throws Exception { Document doc; ReaderManager rm = new ReaderManager(writer, true ); // Index some data for ( int i = 0; i < 100; i++) { doc = new Document(); doc.add( new StringField( "key-" + i, "ABC" , Store.YES)); writer.addDocument(doc); } writer.commit(); IndexCommit ic1 = snapshotter.snapshot(); doc = new Document(); doc.add( new StringField( "key-" + 0, "AAA" , Store.YES)); writer.updateDocument( new Term( "key-" + 0, "ABC" ), doc); rm.maybeRefreshBlocking(); DirectoryReader latest = rm.acquire(); assertTrue(latest.hasDeletions()); // This reader will be used for searching against commit point 1 DirectoryReader searchReader = DirectoryReader.openIfChanged(latest, ic1); // assertFalse(searchReader.hasDeletions()); // XXX - this fails too! rm.release(latest); IndexSearcher s = new IndexSearcher(searchReader); Query q = new TermQuery( new Term( "key-0" , "ABC" )); TopDocs td = s.search(q, 10); assertEquals(1, td.totalHits); searchReader.close(); rm.close(); snapshotter.release(ic1); } } Note, that if I comment out the updateDocument() call, the test passes. Also, if you only have one entry in the index, then it appears that while refreshing the NRT reader, the segment containing just the single delete will be removed, making it look like the test passes: IW 0 [Wed Sep 17 22:32:47 PDT 2014; main]: drop 100% deleted segments: _4(4.6):c1/1 This output does not appear when running the code above, unchanged. Hope this helps... I can't make further headway myself though.
        Hide
        mikemccand Michael McCandless added a comment -

        OK, here's another iteration on the patch ...

        I have backported the fix to 4.6 and now I believe I am seeing another serious issue here.

        Thanks Vitaly, that's a good catch (opening to an older commit point
        from an NRT reader that's carrying deletes in RAM), and we don't
        properly handle that case because SegmentReader doesn't know whether
        its liveDocs came from disk (matching the delGen from the
        SegmentCommitInfo) or were carried in RAM from IndexWriter.

        I fixed this in this new patch by recording whether the
        SegmentReader is NRT, and then fixing the reopen logic to load
        liveDocs from disk in that case. I folded in your example as another
        test case (failed at first but now passes).

        Also I moved the "best-effort detection of rm -rf index" up
        higher... it's currently too low now, i.e. will fail to detect some
        cases that it should.

        Show
        mikemccand Michael McCandless added a comment - OK, here's another iteration on the patch ... I have backported the fix to 4.6 and now I believe I am seeing another serious issue here. Thanks Vitaly, that's a good catch (opening to an older commit point from an NRT reader that's carrying deletes in RAM), and we don't properly handle that case because SegmentReader doesn't know whether its liveDocs came from disk (matching the delGen from the SegmentCommitInfo) or were carried in RAM from IndexWriter. I fixed this in this new patch by recording whether the SegmentReader is NRT, and then fixing the reopen logic to load liveDocs from disk in that case. I folded in your example as another test case (failed at first but now passes). Also I moved the "best-effort detection of rm -rf index" up higher... it's currently too low now, i.e. will fail to detect some cases that it should.
        Hide
        vfunstein Vitaly Funstein added a comment -

        Michael,

        Your updated patch definitely fixes the issue. But I just wanted to understand why deletes are so special, in that - if I don't have any buffered deletes for the segment, but new documents only, the reused reader instance won't pick them up, even without the fix in place. This is because liveDocs won't capture unflushed doc ids?

        Show
        vfunstein Vitaly Funstein added a comment - Michael, Your updated patch definitely fixes the issue. But I just wanted to understand why deletes are so special, in that - if I don't have any buffered deletes for the segment, but new documents only, the reused reader instance won't pick them up, even without the fix in place. This is because liveDocs won't capture unflushed doc ids?
        Hide
        mikemccand Michael McCandless added a comment -

        Hi Vitaly Funstein, I'm glad the patched fixed your issue ... thanks for
        testing.

        Deletes are special because 1) they are allowed to change after a
        segment is written, and 2) IndexWriter carries them in RAM, rather
        than writing to / reading from the filesystem, so the only
        "point-in-time-ness" is maintained by IW being careful to not change a
        liveDocs already paired up with a SegmentReader.

        Show
        mikemccand Michael McCandless added a comment - Hi Vitaly Funstein , I'm glad the patched fixed your issue ... thanks for testing. Deletes are special because 1) they are allowed to change after a segment is written, and 2) IndexWriter carries them in RAM, rather than writing to / reading from the filesystem, so the only "point-in-time-ness" is maintained by IW being careful to not change a liveDocs already paired up with a SegmentReader.
        Hide
        mikemccand Michael McCandless added a comment -

        Woops, this almost dropped past the event horizon of my TODO list.

        I modernized the patch, and was able to improve how effective its check is, by switching to comparing segment IDs (a very good check that the segments changed on disk) vs what the patch used to do, comparing maxDoc.

        Show
        mikemccand Michael McCandless added a comment - Woops, this almost dropped past the event horizon of my TODO list. I modernized the patch, and was able to improve how effective its check is, by switching to comparing segment IDs (a very good check that the segments changed on disk) vs what the patch used to do, comparing maxDoc.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 664e39292bd0a90ed6f20debc872ab74a1d7294f in lucene-solr's branch refs/heads/master from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=664e392 ]

        LUCENE-5931: detect when segments were (illegally) replaced when re-opening an IndexReader

        Show
        jira-bot ASF subversion and git services added a comment - Commit 664e39292bd0a90ed6f20debc872ab74a1d7294f in lucene-solr's branch refs/heads/master from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=664e392 ] LUCENE-5931 : detect when segments were (illegally) replaced when re-opening an IndexReader
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 6b0b119074f4cd32adc2388fbcc01f2aa70c7d5d in lucene-solr's branch refs/heads/branch_6x from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6b0b119 ]

        LUCENE-5931: detect when segments were (illegally) replaced when re-opening an IndexReader

        Show
        jira-bot ASF subversion and git services added a comment - Commit 6b0b119074f4cd32adc2388fbcc01f2aa70c7d5d in lucene-solr's branch refs/heads/branch_6x from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6b0b119 ] LUCENE-5931 : detect when segments were (illegally) replaced when re-opening an IndexReader
        Hide
        mikemccand Michael McCandless added a comment -

        Thanks Adrien Grand for pinging me about this almost lost issue!

        Show
        mikemccand Michael McCandless added a comment - Thanks Adrien Grand for pinging me about this almost lost issue!
        Hide
        mikemccand Michael McCandless added a comment -

        Bulk close resolved issues after 6.2.0 release.

        Show
        mikemccand Michael McCandless added a comment - Bulk close resolved issues after 6.2.0 release.

          People

          • Assignee:
            mikemccand Michael McCandless
            Reporter:
            vfunstein Vitaly Funstein
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development