Solr
  1. Solr
  2. SOLR-4764

When using NRT, just init the reader from IndexWriter

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.5, 5.0
    • Component/s: None
    • Labels:
      None

      Description

      Spinoff from SOLR-4761

      Solr first opens a DirectoryReader from the directory, then later will pass this to IW openIfChanged.

      I noticed this when i was confused that mergedsegmentwarmer doesn't appear to work at first until after you've reopened...

      I'm not totally sure what the current behavior causes (does IW's pool reuse segments from this passed-in "external" reader, or is this causing some horrible doubling-up/inefficient stuff etc?). To some extent i think we should change it even if its actually performant: I think its confusing.

      I think ideally we'd change IndexReaderFactory's method to take writer instead of directory so that custom DirectoryReaders can still work.

      1. SOLR-4764.patch
        14 kB
        Robert Muir
      2. SOLR-4764_test.patch
        3 kB
        Robert Muir

        Issue Links

          Activity

          Hide
          Adrien Grand added a comment -

          4.5 release -> bulk close

          Show
          Adrien Grand added a comment - 4.5 release -> bulk close
          Hide
          Robert Muir added a comment -

          Thanks for reviewing Mark!

          Show
          Robert Muir added a comment - Thanks for reviewing Mark!
          Hide
          ASF subversion and git services added a comment -

          Commit 1513953 from Robert Muir in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1513953 ]

          SOLR-4764: When using NRT, just init the reader from IndexWriter

          Show
          ASF subversion and git services added a comment - Commit 1513953 from Robert Muir in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1513953 ] SOLR-4764 : When using NRT, just init the reader from IndexWriter
          Hide
          ASF subversion and git services added a comment -

          Commit 1513945 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1513945 ]

          SOLR-4764: When using NRT, just init the reader from IndexWriter

          Show
          ASF subversion and git services added a comment - Commit 1513945 from Robert Muir in branch 'dev/trunk' [ https://svn.apache.org/r1513945 ] SOLR-4764 : When using NRT, just init the reader from IndexWriter
          Hide
          Mark Miller added a comment -

          +1 - I think this is a good way to solve this. It's minimally invasive given all the other current code and it allows you still lazy load the IW if you don't reopen readers.

          I've closely reviewed the patch and run the units tests - it all looks okay to me.

          Show
          Mark Miller added a comment - +1 - I think this is a good way to solve this. It's minimally invasive given all the other current code and it allows you still lazy load the IW if you don't reopen readers. I've closely reviewed the patch and run the units tests - it all looks okay to me.
          Hide
          Robert Muir added a comment -

          Here is a patch. All tests pass.

          A few notes:

          • the init respects the existing 'reopenReaders' in solrconfig.xml, so if you have that set to false, it still does the same thing as before.
          • IndexReaderFactory gets a newReader(IndexWriter, ..) so that NRT users can still customize the DirectoryReader being used.
          • A few tests relied upon the fact that init didnt open an indexwriter: MultiCoreExampleTestBase always created a core container and opened cores, so this was moved to the embedded subclass which needs it. The ReplicationHandler test and the TestArbitraryIndexDir just set reopenReaders=false because they rely fundamentally on this 'readonly' case.
          Show
          Robert Muir added a comment - Here is a patch. All tests pass. A few notes: the init respects the existing 'reopenReaders' in solrconfig.xml, so if you have that set to false, it still does the same thing as before. IndexReaderFactory gets a newReader(IndexWriter, ..) so that NRT users can still customize the DirectoryReader being used. A few tests relied upon the fact that init didnt open an indexwriter: MultiCoreExampleTestBase always created a core container and opened cores, so this was moved to the embedded subclass which needs it. The ReplicationHandler test and the TestArbitraryIndexDir just set reopenReaders=false because they rely fundamentally on this 'readonly' case.
          Hide
          Robert Muir added a comment -

          here's a test.

          Show
          Robert Muir added a comment - here's a test.
          Hide
          Michael Garski added a comment -

          I discovered the same thing, and that there is no re-use of segment caches on a read-only replication slave as well. I opened SOLR-4909 to handle the slave replication case. I'm not sure that changing the IndexReaderFactory's method will work with a read-only slave, but adding another method for NRT/replication masters that take a writer might work.

          Show
          Michael Garski added a comment - I discovered the same thing, and that there is no re-use of segment caches on a read-only replication slave as well. I opened SOLR-4909 to handle the slave replication case. I'm not sure that changing the IndexReaderFactory's method will work with a read-only slave, but adding another method for NRT/replication masters that take a writer might work.
          Hide
          Michael McCandless added a comment -

          +1, this is very costly, because the first NRT open will open an entirely new set of SegmentReaders (not sharing anything from the non-NRT reader passed in to openIfChanged).

          Show
          Michael McCandless added a comment - +1, this is very costly, because the first NRT open will open an entirely new set of SegmentReaders (not sharing anything from the non-NRT reader passed in to openIfChanged).

            People

            • Assignee:
              Unassigned
              Reporter:
              Robert Muir
            • Votes:
              1 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development