Lucene - Core
  1. Lucene - Core
  2. LUCENE-2311

Pass potent SR to IRWarmer.warm(), and also call warm() for new segments

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.3, 3.0.2, 3.1, 4.0-ALPHA
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Currently warm() receives a SegmentReader without terms index and docstores.
      It would be arguably more useful for the app to receive a fully loaded reader, so it can actually fire up some caches. If the warmer is undefined on IW, we probably leave things as they are.

      It is also arguably more concise and clear to call warm() on all newly created segments, so there is a single point of warming readers in NRT context, and every subreader coming from getReader is guaranteed to be warmed up -> you don't have to introduce even more mess in your code by rechecking it.

      1. LUCENE-2311.patch
        3 kB
        Michael McCandless
      2. LUCENE-2311.patch
        5 kB
        Michael McCandless

        Activity

        Earwin Burrfoot created issue -
        Michael McCandless made changes -
        Field Original Value New Value
        Fix Version/s 3.1 [ 12314025 ]
        Hide
        Earwin Burrfoot added a comment -

        Not only newly created, but newly opened too! So your readers are warmed, if you just created an IW, called getReader, and didn't do any writes at all yet.

        Show
        Earwin Burrfoot added a comment - Not only newly created, but newly opened too! So your readers are warmed, if you just created an IW, called getReader, and didn't do any writes at all yet.
        Michael McCandless made changes -
        Assignee Michael McCandless [ mikemccand ]
        Robert Muir made changes -
        Component/s Index [ 12310232 ]
        Michael McCandless made changes -
        Fix Version/s 2.9.3 [ 12314799 ]
        Fix Version/s 3.0.2 [ 12314798 ]
        Fix Version/s 3.1 [ 12314822 ]
        Hide
        Michael McCandless added a comment -

        Test case showing the issue.

        Basically you cannot run any searches against the reader passed to your warmer...

        I also had to fix LuceneTestCase – it was masking the original exception that CMS hit.

        Show
        Michael McCandless added a comment - Test case showing the issue. Basically you cannot run any searches against the reader passed to your warmer... I also had to fix LuceneTestCase – it was masking the original exception that CMS hit.
        Michael McCandless made changes -
        Attachment LUCENE-2311.patch [ 12446059 ]
        Hide
        Michael McCandless added a comment -

        It is also arguably more concise and clear to call warm() on all newly created segments, so there is a single point of warming readers in NRT context, and every subreader coming from getReader is guaranteed to be warmed up -> you don't have to introduce even more mess in your code by rechecking it.

        I agree this would be nice, however... it's less important that this be called from within IW, because this warming time (on newly flushed segments) must be done in the foreground of the reopen thread. Ie, whether we do this in IW or app does it externally, this will block an NRT reader turnaround.

        Vs warming a newly merged segment, where we have the freedom to continue using the old (merged) segments until the newly merged one is warmed.

        Show
        Michael McCandless added a comment - It is also arguably more concise and clear to call warm() on all newly created segments, so there is a single point of warming readers in NRT context, and every subreader coming from getReader is guaranteed to be warmed up -> you don't have to introduce even more mess in your code by rechecking it. I agree this would be nice, however... it's less important that this be called from within IW, because this warming time (on newly flushed segments) must be done in the foreground of the reopen thread. Ie, whether we do this in IW or app does it externally, this will block an NRT reader turnaround. Vs warming a newly merged segment, where we have the freedom to continue using the old (merged) segments until the newly merged one is warmed.
        Hide
        Michael McCandless added a comment -

        New patch, includes trivial fix (test passes) and CHANGES.

        I plan to commit shortly.

        Show
        Michael McCandless added a comment - New patch, includes trivial fix (test passes) and CHANGES. I plan to commit shortly.
        Michael McCandless made changes -
        Attachment LUCENE-2311.patch [ 12446060 ]
        Hide
        Earwin Burrfoot added a comment -

        This is not the issue of reader turnaround. This is the issue of code clarity.

        From the user's standpoint warming up a new-from-scratch segment and warming up just-merged segment is the same action. The difference of what happens in which thread, and what blocks/doesn't block reopen is totally Lucene-internal API-wise. Now you're forcing people to make this distinction and do absolutely identical warming-up action in two very different places. Moreover, when warming up a new reader from reopen(), they have to get new segments and then discern which ones of them were flushed, and which were merged, to avoid warming them up twice.

        I think there is no hurry to push the fix for this into 2.9.3. As of now this feature is broken. There is absolutely no way you can make it work. Thus, there are no applications that are suffering. Thus, not releasing the fix with 2.9.3, does not multiply suffering under the sun Thus, no need to hurry up and fix this at least somehow instead of doing this properly.

        Show
        Earwin Burrfoot added a comment - This is not the issue of reader turnaround. This is the issue of code clarity. From the user's standpoint warming up a new-from-scratch segment and warming up just-merged segment is the same action. The difference of what happens in which thread, and what blocks/doesn't block reopen is totally Lucene-internal API-wise. Now you're forcing people to make this distinction and do absolutely identical warming-up action in two very different places. Moreover, when warming up a new reader from reopen(), they have to get new segments and then discern which ones of them were flushed, and which were merged, to avoid warming them up twice. I think there is no hurry to push the fix for this into 2.9.3. As of now this feature is broken. There is absolutely no way you can make it work. Thus, there are no applications that are suffering. Thus, not releasing the fix with 2.9.3, does not multiply suffering under the sun Thus, no need to hurry up and fix this at least somehow instead of doing this properly.
        Hide
        Michael McCandless added a comment -

        From the user's standpoint warming up a new-from-scratch segment and warming up just-merged segment is the same action.

        OK, I agree... but, let's open a separate issue to rename
        setMergedSegmentWarmer -> setSegmentWarmer, and have it be invoked for
        flushed and merged segments?

        Does your pending patch (what's the issue number again?) do this already?

        I agree this change isn't needed for 2.9.3.

        I think there is no hurry to push the fix for this into 2.9.3.

        I think the first part of this bug (that you get an impotent
        SegmentReader passed to your segment warmer) should be fixed for
        2.9.3. The fix is trivial.

        Show
        Michael McCandless added a comment - From the user's standpoint warming up a new-from-scratch segment and warming up just-merged segment is the same action. OK, I agree... but, let's open a separate issue to rename setMergedSegmentWarmer -> setSegmentWarmer, and have it be invoked for flushed and merged segments? Does your pending patch (what's the issue number again?) do this already? I agree this change isn't needed for 2.9.3. I think there is no hurry to push the fix for this into 2.9.3. I think the first part of this bug (that you get an impotent SegmentReader passed to your segment warmer) should be fixed for 2.9.3. The fix is trivial.
        Hide
        Michael McCandless added a comment -

        OK I opened LUCENE-2485 for the 2nd part of this issue.

        I'll commit the first part shortly.

        Show
        Michael McCandless added a comment - OK I opened LUCENE-2485 for the 2nd part of this issue. I'll commit the first part shortly.
        Hide
        Earwin Burrfoot added a comment -

        Does your pending patch (what's the issue number again?) do this already?

        LUCENE-2355 - this patch doesn't do this yet.
        The next part removes the need for readerWarmer, as each reader has a number of components that are notified when reader is created/closed (and can warm themselves appropriately).
        This also takes care of one of Yonik's concerns from LUCENE-2485

        Passing in the complete index (in addition to just the new segment) would allow incremental updating of an index-wide data structure

        The factories that create components are shared for DirReader+SRs or IW-readerPool+SRs+IWBackedReader, so new components by default have access to index-wide context.

        The part that is missing is the way for the user to specify if he wants his newly merged SRs pre-warmed and up to which runlevel.

        Show
        Earwin Burrfoot added a comment - Does your pending patch (what's the issue number again?) do this already? LUCENE-2355 - this patch doesn't do this yet. The next part removes the need for readerWarmer, as each reader has a number of components that are notified when reader is created/closed (and can warm themselves appropriately). This also takes care of one of Yonik's concerns from LUCENE-2485 Passing in the complete index (in addition to just the new segment) would allow incremental updating of an index-wide data structure The factories that create components are shared for DirReader+SRs or IW-readerPool+SRs+IWBackedReader, so new components by default have access to index-wide context. The part that is missing is the way for the user to specify if he wants his newly merged SRs pre-warmed and up to which runlevel.
        Michael McCandless made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Uwe Schindler made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Mark Thomas made changes -
        Workflow jira [ 12501563 ] Default workflow, editable Closed status [ 12564245 ]
        Mark Thomas made changes -
        Workflow Default workflow, editable Closed status [ 12564245 ] jira [ 12584036 ]
        Transition Time In Source Status Execution Times Last Executer Last Execution Date
        Open Open Resolved Resolved
        82d 19h 4m 1 Michael McCandless 02/Jun/10 18:23
        Resolved Resolved Closed Closed
        15d 14h 39m 1 Uwe Schindler 18/Jun/10 09:03

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Earwin Burrfoot
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development