Lucene - Core
  1. Lucene - Core
  2. LUCENE-2311

Pass potent SR to IRWarmer.warm(), and also call warm() for new segments

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.3, 3.0.2, 3.1, 4.0-ALPHA
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Currently warm() receives a SegmentReader without terms index and docstores.
      It would be arguably more useful for the app to receive a fully loaded reader, so it can actually fire up some caches. If the warmer is undefined on IW, we probably leave things as they are.

      It is also arguably more concise and clear to call warm() on all newly created segments, so there is a single point of warming readers in NRT context, and every subreader coming from getReader is guaranteed to be warmed up -> you don't have to introduce even more mess in your code by rechecking it.

      1. LUCENE-2311.patch
        3 kB
        Michael McCandless
      2. LUCENE-2311.patch
        5 kB
        Michael McCandless

        Activity

        Hide
        Earwin Burrfoot added a comment -

        Not only newly created, but newly opened too! So your readers are warmed, if you just created an IW, called getReader, and didn't do any writes at all yet.

        Show
        Earwin Burrfoot added a comment - Not only newly created, but newly opened too! So your readers are warmed, if you just created an IW, called getReader, and didn't do any writes at all yet.
        Hide
        Michael McCandless added a comment -

        Test case showing the issue.

        Basically you cannot run any searches against the reader passed to your warmer...

        I also had to fix LuceneTestCase – it was masking the original exception that CMS hit.

        Show
        Michael McCandless added a comment - Test case showing the issue. Basically you cannot run any searches against the reader passed to your warmer... I also had to fix LuceneTestCase – it was masking the original exception that CMS hit.
        Hide
        Michael McCandless added a comment -

        It is also arguably more concise and clear to call warm() on all newly created segments, so there is a single point of warming readers in NRT context, and every subreader coming from getReader is guaranteed to be warmed up -> you don't have to introduce even more mess in your code by rechecking it.

        I agree this would be nice, however... it's less important that this be called from within IW, because this warming time (on newly flushed segments) must be done in the foreground of the reopen thread. Ie, whether we do this in IW or app does it externally, this will block an NRT reader turnaround.

        Vs warming a newly merged segment, where we have the freedom to continue using the old (merged) segments until the newly merged one is warmed.

        Show
        Michael McCandless added a comment - It is also arguably more concise and clear to call warm() on all newly created segments, so there is a single point of warming readers in NRT context, and every subreader coming from getReader is guaranteed to be warmed up -> you don't have to introduce even more mess in your code by rechecking it. I agree this would be nice, however... it's less important that this be called from within IW, because this warming time (on newly flushed segments) must be done in the foreground of the reopen thread. Ie, whether we do this in IW or app does it externally, this will block an NRT reader turnaround. Vs warming a newly merged segment, where we have the freedom to continue using the old (merged) segments until the newly merged one is warmed.
        Hide
        Michael McCandless added a comment -

        New patch, includes trivial fix (test passes) and CHANGES.

        I plan to commit shortly.

        Show
        Michael McCandless added a comment - New patch, includes trivial fix (test passes) and CHANGES. I plan to commit shortly.
        Hide
        Earwin Burrfoot added a comment -

        This is not the issue of reader turnaround. This is the issue of code clarity.

        From the user's standpoint warming up a new-from-scratch segment and warming up just-merged segment is the same action. The difference of what happens in which thread, and what blocks/doesn't block reopen is totally Lucene-internal API-wise. Now you're forcing people to make this distinction and do absolutely identical warming-up action in two very different places. Moreover, when warming up a new reader from reopen(), they have to get new segments and then discern which ones of them were flushed, and which were merged, to avoid warming them up twice.

        I think there is no hurry to push the fix for this into 2.9.3. As of now this feature is broken. There is absolutely no way you can make it work. Thus, there are no applications that are suffering. Thus, not releasing the fix with 2.9.3, does not multiply suffering under the sun Thus, no need to hurry up and fix this at least somehow instead of doing this properly.

        Show
        Earwin Burrfoot added a comment - This is not the issue of reader turnaround. This is the issue of code clarity. From the user's standpoint warming up a new-from-scratch segment and warming up just-merged segment is the same action. The difference of what happens in which thread, and what blocks/doesn't block reopen is totally Lucene-internal API-wise. Now you're forcing people to make this distinction and do absolutely identical warming-up action in two very different places. Moreover, when warming up a new reader from reopen(), they have to get new segments and then discern which ones of them were flushed, and which were merged, to avoid warming them up twice. I think there is no hurry to push the fix for this into 2.9.3. As of now this feature is broken. There is absolutely no way you can make it work. Thus, there are no applications that are suffering. Thus, not releasing the fix with 2.9.3, does not multiply suffering under the sun Thus, no need to hurry up and fix this at least somehow instead of doing this properly.
        Hide
        Michael McCandless added a comment -

        From the user's standpoint warming up a new-from-scratch segment and warming up just-merged segment is the same action.

        OK, I agree... but, let's open a separate issue to rename
        setMergedSegmentWarmer -> setSegmentWarmer, and have it be invoked for
        flushed and merged segments?

        Does your pending patch (what's the issue number again?) do this already?

        I agree this change isn't needed for 2.9.3.

        I think there is no hurry to push the fix for this into 2.9.3.

        I think the first part of this bug (that you get an impotent
        SegmentReader passed to your segment warmer) should be fixed for
        2.9.3. The fix is trivial.

        Show
        Michael McCandless added a comment - From the user's standpoint warming up a new-from-scratch segment and warming up just-merged segment is the same action. OK, I agree... but, let's open a separate issue to rename setMergedSegmentWarmer -> setSegmentWarmer, and have it be invoked for flushed and merged segments? Does your pending patch (what's the issue number again?) do this already? I agree this change isn't needed for 2.9.3. I think there is no hurry to push the fix for this into 2.9.3. I think the first part of this bug (that you get an impotent SegmentReader passed to your segment warmer) should be fixed for 2.9.3. The fix is trivial.
        Hide
        Michael McCandless added a comment -

        OK I opened LUCENE-2485 for the 2nd part of this issue.

        I'll commit the first part shortly.

        Show
        Michael McCandless added a comment - OK I opened LUCENE-2485 for the 2nd part of this issue. I'll commit the first part shortly.
        Hide
        Earwin Burrfoot added a comment -

        Does your pending patch (what's the issue number again?) do this already?

        LUCENE-2355 - this patch doesn't do this yet.
        The next part removes the need for readerWarmer, as each reader has a number of components that are notified when reader is created/closed (and can warm themselves appropriately).
        This also takes care of one of Yonik's concerns from LUCENE-2485

        Passing in the complete index (in addition to just the new segment) would allow incremental updating of an index-wide data structure

        The factories that create components are shared for DirReader+SRs or IW-readerPool+SRs+IWBackedReader, so new components by default have access to index-wide context.

        The part that is missing is the way for the user to specify if he wants his newly merged SRs pre-warmed and up to which runlevel.

        Show
        Earwin Burrfoot added a comment - Does your pending patch (what's the issue number again?) do this already? LUCENE-2355 - this patch doesn't do this yet. The next part removes the need for readerWarmer, as each reader has a number of components that are notified when reader is created/closed (and can warm themselves appropriately). This also takes care of one of Yonik's concerns from LUCENE-2485 Passing in the complete index (in addition to just the new segment) would allow incremental updating of an index-wide data structure The factories that create components are shared for DirReader+SRs or IW-readerPool+SRs+IWBackedReader, so new components by default have access to index-wide context. The part that is missing is the way for the user to specify if he wants his newly merged SRs pre-warmed and up to which runlevel.

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Earwin Burrfoot
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development