Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4
    • Component/s: None
    • Labels:
      None

      Description

      Take advantage of IndexReader.reopen(): LUCENE-743

      1. SOLR-374.patch
        2 kB
        Mark Miller
      2. SOLR-374.patch
        2 kB
        Mark Miller
      3. SOLR-374.patch
        3 kB
        Mark Miller
      4. SOLR-374.patch
        4 kB
        Mark Miller
      5. SOLR-374.patch
        4 kB
        Mark Miller

        Activity

        Mark Miller added a comment -

        I may be missing something, but this one is pretty simple, right?

        The biggest issue I see is that some tests rely on getting a new Reader reference after a commit whether it's needed or not (that is, even when the index hasn't changed). So while I'd like to just return the existing searcher when the Reader hasn't changed, those tests would have to be changed. As is, a small change should actually be cheaper than no change, I think.

        Yonik Seeley added a comment -

        It would not have been easy in the past, but with all of the recent changes, it should be simple.
        This patch has a couple of issues though:

        • a race condition: the reader could be closed between the time you get it and the time you try to call reopen().
        • descriptor leak: you pass closeReader=false, but no one else will close this reader.
        • the last reader to be opened is the one that should be re-opened, not necessarily the currently registered one

        See the getNewestSearcher() method I recently added to fix both #1 and #3
        Also, I think that any test that expects the reader to be different should be changed.
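
A minimal sketch of why taking the reference under the lock addresses the race and the "newest, not registered" concerns listed above. This is not the Solr code: `CountedResource`, `ResourceRegistry`, and `acquireNewest()` are hypothetical stand-ins, and only the incref/decref idea is taken from the discussion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reference-counted searcher holder.
class CountedResource {
    private final AtomicInteger refs = new AtomicInteger(1); // the registry's reference
    private volatile boolean closed = false;
    final String name;

    CountedResource(String name) { this.name = name; }

    void incref() {
        if (closed) throw new IllegalStateException("already closed: " + name);
        refs.incrementAndGet();
    }

    void decref() {
        if (refs.decrementAndGet() == 0) closed = true; // last reference closes it
    }

    boolean isClosed() { return closed; }
}

// Hypothetical registry that tracks every opened resource, newest last.
class ResourceRegistry {
    private final Object lock = new Object();
    private final List<CountedResource> opened = new ArrayList<>();

    void add(CountedResource r) {
        synchronized (lock) { opened.add(r); }
    }

    // Drops the registry's own reference and forgets the resource.
    void release(CountedResource r) {
        synchronized (lock) {
            if (opened.remove(r)) r.decref();
        }
    }

    // Returns the newest resource with its count already incremented, or null.
    CountedResource acquireNewest() {
        synchronized (lock) {
            if (opened.isEmpty()) return null;
            CountedResource newest = opened.get(opened.size() - 1);
            // The registry still holds its reference while we are inside the lock,
            // so the resource cannot be closed between lookup and incref.
            newest.incref();
            return newest;
        }
    }
}
```

The caller owns the extra reference returned by `acquireNewest()` and is expected to call `decref()` when done; without that call the resource is never closed.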

        Mark Miller added a comment -
        • a race condition: the reader could be closed between the time you get it and the time you try to call reopen().

        Ah, because of no incref...

        • descriptor leak: you pass closeReader=false, but no one else will close this reader.

        Dumb mistake here - made a private method public just so I could pass true and then still passed false...

        >>Also, I think that any test that expects the reader to be different should be changed.
        Alright, easy enough, just two I think: elevation and function tests, using the reader as a key in a map or something.

        If that's all for the reopen, I've got that looking good I think; I just have to take care of the tests.

        Mark Miller added a comment -

        Hmmm...looks like I was wrong about those tests failing just because of the same Reader. It looked that way, and the expected fix worked, but after doing things correctly as directed by Yonik, all the tests now pass with no problem.

        Yonik Seeley added a comment -

        You've involved yourself in one of the more complicated methods in Solr.

        • Latest patch has a new race condition: _searcher.incref() may be called after a final _searcher.decref() closes the searcher/reader.
        • we shouldn't need to check if _searcher==null or not... there may be searchers open that have not yet been registered.
        • if the reader from the newest searcher is equal to its reopen, you return the registered searcher (which may actually be different from the newest searcher)
        • returning a RefCounted<SolrIndexSearcher> immediately can expose it before it was supposed to be used (before warming has completed, etc)
        • returning a RefCounted<SolrIndexSearcher> is not always the right thing to do - it depends on the parameters to the function.

        There are really two different optimizations here:
        1) call IndexReader.reopen() and share parts of the most recently opened IndexReader
        2) if the IndexReader didn't change, avoid going through warming, autowarming, etc and just reuse the same searcher
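
The distinction between the two optimizations can be sketched as follows. This is an illustration under stated assumptions, not the patch: `SketchReader`, `SketchSearcher`, and `ReopenPolicy` are hypothetical, and the index state is passed in as a version number, whereas the real `IndexReader.reopen()` takes no arguments. The only behavior borrowed from Lucene is that `reopen()` hands back the same instance when the index has not changed.

```java
// Hypothetical stand-in for an index reader; "version" models the index state.
class SketchReader {
    final long version;

    SketchReader(long version) { this.version = version; }

    // Same instance when nothing changed, otherwise a new reader.
    SketchReader reopen(long currentIndexVersion) {
        return currentIndexVersion == version ? this : new SketchReader(currentIndexVersion);
    }
}

// Hypothetical searcher that records whether it went through warming.
class SketchSearcher {
    final SketchReader reader;
    boolean warmed = false;

    SketchSearcher(SketchReader reader) { this.reader = reader; }

    void warm() { warmed = true; }
}

class ReopenPolicy {
    static SketchSearcher next(SketchSearcher newest, long currentIndexVersion) {
        // Optimization 1: always go through reopen() on the newest reader.
        SketchReader reopened = newest.reader.reopen(currentIndexVersion);
        if (reopened == newest.reader) {
            // Optimization 2: nothing changed, so skip warming and reuse the searcher.
            return newest;
        }
        SketchSearcher fresh = new SketchSearcher(reopened);
        fresh.warm();
        return fresh;
    }
}
```

Optimization 1 only changes how the reader is obtained; optimization 2 changes what is returned to callers, which is why it interacts with registration and warming.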

        Mark Miller added a comment -
        • Latest patch has a new race condition: _searcher.incref() may be called after a final _searcher.decref() closes the searcher/reader.
          Right...since I shouldn't even be returning _searcher, that goes away I think
        • we shouldn't need to check if _searcher==null or not... there may be searchers open that have not yet been registered.
          Right....gone.
        • if the reader from the newest searcher is equal to its reopen, you return the registered searcher (which may actually be different from the newest searcher)
          Right....gone.
        • returning a RefCounted<SolrIndexSearcher> immediately can expose it before it was supposed to be used (before warming has completed, etc)
          Right....gone.
        • returning a RefCounted<SolrIndexSearcher> is not always the right thing to do - it depends on the parameters to the function.
          Good point

        So I guess the key on this patch (as you pointed out) is that it is two optimizations, and the second one (doing nothing if the Reader hasn't changed) really makes things more difficult. Dropping it for now solves pretty much each of these issues, I think.

        I was right about the Reader and the tests as well...things passed because they were still wrong - so I have adjusted the two tests to actually change the index instead of just committing.

        I think this does just the reopen correctly, but I am still scanning and checking. I definitely missed where the first sync on the search lock was closing in the earlier patch...so many braces.

        Yonik Seeley added a comment -

        Looking pretty good now... but there is a reference leak. decref() should always be called for any RefCounted instances (preferably in a finally block)
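
A minimal sketch of the finally-block pattern being asked for. `Handle` and `FinallyDemo` are hypothetical names used only for illustration; the point is that every `incref()` is paired with a `decref()` that runs even when the work in between throws.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reference-counted handle.
class Handle {
    private final AtomicInteger refs = new AtomicInteger(0);

    void incref() { refs.incrementAndGet(); }
    void decref() { refs.decrementAndGet(); }
    int refCount() { return refs.get(); }
}

class FinallyDemo {
    // Takes a reference, runs the work, and always gives the reference back.
    static void useHandle(Handle h, Runnable work) {
        h.incref();
        try {
            work.run();
        } finally {
            h.decref(); // runs on normal return and on exception
        }
    }
}
```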

        Mark Miller added a comment -

        Okay. Not sure how kosher taking ownership of the Reader from SolrIndexSearcher is, but it seems the thing to do then.

        Mark Miller added a comment - - edited

        Hmmm...probably need a searcher lock around taking reader ownership....

        or not...the incref will keep it from being closed. NM.

        Yonik Seeley added a comment -

        or not...the incref will keep it from being closed. NM.

        Right. I think all you need to add is a decref()

        Mark Miller added a comment -

        I've got the decref on the newestSearcher in a finally block there - miss it? or did I botch it?

        Yonik Seeley added a comment -

        I missed the last patch (I wish JIRA defaulted to "All").

        It seems like if reopen() returns us the same reader, we should just incRef it... (or is that in a slightly later version of Lucene?)
        Trying to steal the reader instead seems hard to get right (seems like another thread could try to open another searcher, but our searcher doesn't have it and neither does the old one, so your exception might be triggered.)
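
A sketch of the incRef alternative, assuming a reader that counts its own references. `SharedReader`, `OwningSearcher`, and `SearcherFactory` are hypothetical, and `reopen(boolean)` takes a flag only so the example is self-contained. When reopen returns the same reader, the new searcher adds a reference instead of taking the reader away from the old searcher, so each searcher can close independently.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reader that closes itself when its last reference is released.
class SharedReader {
    private final AtomicInteger refs = new AtomicInteger(1); // opener's reference
    private volatile boolean closed = false;

    void incRef() { refs.incrementAndGet(); }

    void decRef() {
        if (refs.decrementAndGet() == 0) closed = true;
    }

    boolean isClosed() { return closed; }

    // Same instance when the index is unchanged, otherwise a new reader.
    SharedReader reopen(boolean indexChanged) {
        return indexChanged ? new SharedReader() : this;
    }
}

// Hypothetical searcher that releases one reader reference when closed.
class OwningSearcher {
    private final SharedReader reader;

    OwningSearcher(SharedReader reader) { this.reader = reader; }

    SharedReader getReader() { return reader; }

    void close() { reader.decRef(); }
}

class SearcherFactory {
    static OwningSearcher openFrom(OwningSearcher old, boolean indexChanged) {
        SharedReader current = old.getReader();
        SharedReader next = current.reopen(indexChanged);
        if (next == current) {
            next.incRef(); // shared: old and new searcher each hold one reference
        }
        return new OwningSearcher(next);
    }
}
```

With this arrangement no searcher is ever left without its reader, which is the failure mode described for the ownership-stealing approach.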

        Mark Miller added a comment -

        Man...nothing is ever simple. A search lock around the ownership change would solve that, right? The incref on the Reader is way cleaner, though. From what I can tell, the Lucene version in Solr is a bit too old for it. Worth it to wait, I think - much better than a sync.

        Mark Miller added a comment -

        I'm still firming up my knowledge of this class, but I think this is right. Just switched to the incref rather than Reader ownership change.

        Yonik Seeley added a comment -

        Committed. Thanks Mark!

        This did cause a test failure on Windows, but it's the test that needs fixing: SOLR-775


          People

          • Assignee:
            Unassigned
            Reporter:
            Yonik Seeley