Lucene - Core
  1. Lucene - Core
  2. LUCENE-3486

Add SearcherLifetimeManager, so you can retrieve the same searcher you previously used

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.5, 4.0-ALPHA
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The idea is similar to SOLR-2809 (adding searcher leases to Solr).

      This utility class sits above whatever your source is for "the
      current" searcher (eg NRTManager, SearcherManager, etc.), and records
      (holds a reference to) each searcher in recent history.

      The idea is to ensure that when a user does a follow-on action (clicks
      next page, drills down/up), or when two or more searcher invocations
      within a single user search need to happen against the same searcher
      (eg in distributed search), you can retrieve the same searcher you
      used "last time".

      I think with the new searchAfter API (LUCENE-2215), doing follow-on
      searches on the same searcher is more important, since the "bottom"
      (score/docID) held for that API can easily shift when a new searcher
      is opened.

      When you do a "new" search, you record the searcher you used with the
      manager, and it returns to you a long token (currently just the
      IR.getVersion()), which you can later use to retrieve the same
      searcher.

      Separately you must periodically call prune(), to prune the old
      searchers, ideally from the same thread / at the same time that
      you open a new searcher.

      1. LUCENE-3486.patch
        15 kB
        Michael McCandless
      2. LUCENE-3486.patch
        16 kB
        Michael McCandless
      3. LUCENE-3486.patch
        14 kB
        Michael McCandless

        Activity

        Hide
        Uwe Schindler added a comment -

        Bulk close after release of 3.5

        Show
        Uwe Schindler added a comment - Bulk close after release of 3.5
        Hide
        Michael McCandless added a comment -

        Thanks Uwe, you're right – new patch, to just use IOUtils. I also make an effort to catch mis-use (record called while close is running).

        Show
        Michael McCandless added a comment - Thanks Uwe, you're right – new patch, to just use IOUtils. I also make an effort to catch mis-use (record called while close is running).
        Hide
        Uwe Schindler added a comment -

        Mike: Why do we need the duplicate of IOUtils in SearcherLifetimeManager.close()? You can also use IOUtils.closeSafely(Collection)?

        Show
        Uwe Schindler added a comment - Mike: Why do we need the duplicate of IOUtils in SearcherLifetimeManager.close()? You can also use IOUtils.closeSafely(Collection)?
        Hide
        Uwe Schindler added a comment -

        Jason: Java 5 does not allow @Override on interfaces!

        Show
        Uwe Schindler added a comment - Jason: Java 5 does not allow @Override on interfaces!
        Hide
        Jason Rutherglen added a comment -

        Looks good. I noticed you marked close() with @Override. Are we on Java 6 in 3.x?

        @Override is all over the place in Solr!?

        Show
        Jason Rutherglen added a comment - Looks good. I noticed you marked close() with @Override. Are we on Java 6 in 3.x? @Override is all over the place in Solr!?
        Hide
        Michael McCandless added a comment -

        I noticed you marked close() with @Override. Are we on Java 6 in 3.x?

        Sigh, no we are not; in 3.x I'll have to comment that out.

        Show
        Michael McCandless added a comment - I noticed you marked close() with @Override. Are we on Java 6 in 3.x? Sigh, no we are not; in 3.x I'll have to comment that out.
        Hide
        Shai Erera added a comment -

        Looks good. I noticed you marked close() with @Override. Are we on Java 6 in 3.x?

        Show
        Shai Erera added a comment - Looks good. I noticed you marked close() with @Override. Are we on Java 6 in 3.x?
        Hide
        Michael McCandless added a comment -

        Patch w/ Shai's suggestions, and tweaked some jdocs.

        Show
        Michael McCandless added a comment - Patch w/ Shai's suggestions, and tweaked some jdocs.
        Hide
        Michael McCandless added a comment -

        Perhaps instead of "recordTimeSec = System.nanoTime()/1000000000.0;" you can use TimeUnit.NANOS.toSeconds? Just for clarity, and get rid of this monstrous number .

        I was wanting to have the seconds be a double, but, I agree that's
        overkill; I think a "typical" pruning time should be maybe 10 minutes
        and so having "int seconds" is OK. I'll change it and use
        TimeUnit.NANOSECONDS.toSeconds.

        Typo: "such that if the use performs": 'use' --> 'user'

        Thanks I'll fix.

        Would it be better if SearcherTracker has a close() method and we call it instead of decRef()-ing on our own?

        I agree – I'll fix.

        About this "// nocommit – maybe make it 'public' that you just decRef?" --> do you mean whether we should jdoc that that's all we're doing? If so, why commit to just that? I don't think it contributes to the user ...

        Well... the release is sort of "spooky" in that you can freely call it after close, which is why I thought about making its impl public; but I agree, let's leave it private and just keep the NOTE that it's fine to call after close.

        I have a problem with the Pruner interface. It has a single method prune which takes an IndexSearcher and ageSec (BTW, why is it double and not long?). And there's PruneByAge impl. But, what other impls could there be for this interface, if not by age?
        On the other hand, I can certainly see someone, perhaps w/ NRT, not wanting to keep too many searchers around, and instead of committing to an age, he'll want to use a hard number (like, the newest 5 searchers) - that interface makes it impossible to impl.
        If you think however that pruning by age, is the only scenario that makes sense, then I suggest removing the interface and having just the impl. Otherwise, perhaps a different interface should be created, one that receives a list of searchers, with their age, and returns a list of searchers that should be released? Just an idea.

        Hmm, now that I read prune(Pruner) jdoc, I can see how someone could impl "newest 5 searchers" by just counting up to 5 in its doPrune() calls, because prune(Pruner) guarantees that the searchers are passed newest to oldest. But still I wonder if the interface is not too limited.

        Right, my idea was to make it really easy to prune-by-age, since
        that's the common case, but also make it possible to do more
        expert policies.

        I think prune-by-count is possible, but maybe more interesting would
        be prune-by-total-segments-size, ie, if a large merge commits, this
        metric would cut back on the number of searchers so that the net RAM
        tied up is lower. Not sure this is really needed in practice as large
        merges don't complete often and it's unlikely you'd hit more than one in
        your time window...

        Thanks Shai!

        Show
        Michael McCandless added a comment - Perhaps instead of "recordTimeSec = System.nanoTime()/1000000000.0;" you can use TimeUnit.NANOS.toSeconds? Just for clarity, and get rid of this monstrous number . I was wanting to have the seconds be a double, but, I agree that's overkill; I think a "typical" pruning time should be maybe 10 minutes and so having "int seconds" is OK. I'll change it and use TimeUnit.NANOSECONDS.toSeconds. Typo: "such that if the use performs": 'use' --> 'user' Thanks I'll fix. Would it be better if SearcherTracker has a close() method and we call it instead of decRef()-ing on our own? I agree – I'll fix. About this "// nocommit – maybe make it 'public' that you just decRef?" --> do you mean whether we should jdoc that that's all we're doing? If so, why commit to just that? I don't think it contributes to the user ... Well... the release is sort of "spooky" in that you can freely call it after close, which is why I thought about making its impl public; but I agree, let's leave it private and just keep the NOTE that it's fine to call after close. I have a problem with the Pruner interface. It has a single method prune which takes an IndexSearcher and ageSec (BTW, why is it double and not long?). And there's PruneByAge impl. But, what other impls could there be for this interface, if not by age? On the other hand, I can certainly see someone, perhaps w/ NRT, not wanting to keep too many searchers around, and instead of committing to an age, he'll want to use a hard number (like, the newest 5 searchers) - that interface makes it impossible to impl. If you think however that pruning by age, is the only scenario that makes sense, then I suggest removing the interface and having just the impl. Otherwise, perhaps a different interface should be created, one that receives a list of searchers, with their age, and returns a list of searchers that should be released? Just an idea. Hmm, now that I read prune(Pruner) jdoc, I can see how someone could impl "newest 5 searchers" by just counting up to 5 in its doPrune() calls, because prune(Pruner) guarantees that the searchers are passed newest to oldest. But still I wonder if the interface is not too limited. Right, my idea was to make it really easy to prune-by-age, since that's the common case, but also make it possible to do more expert policies. I think prune-by-count is possible, but maybe more interesting would be prune-by-total-segments-size, ie, if a large merge commits, this metric would cut back on the number of searchers so that the net RAM tied up is lower. Not sure this is really needed in practice as large merges don't complete often and it's unlikely you'd hit more than one in your time window... Thanks Shai!
        Hide
        Shai Erera added a comment -

        What a cool object, Mike ! And the javadocs are very good too.

        • Perhaps instead of "recordTimeSec = System.nanoTime()/1000000000.0;" you can use TimeUnit.NANOS.toSeconds? Just for clarity, and get rid of this monstrous number .
        • Typo: "such that if the use performs": 'use' --> 'user'
        • About this code:
          +    if (tracker == null) {
          +      tracker = new SearcherTracker(searcher);
          +      if (searchers.putIfAbsent(version, tracker) != null) {
          +        // Another thread beat us -- must decRef to undo
          +        // incRef done by SearcherTracker ctor:
          +        searcher.getIndexReader().decRef();
          +      }
          

          Would it be better if SearcherTracker has a close() method and we call it instead of decRef()-ing on our own? Seems cleaner to me, and I always like to see the code that incRef/new closer to decRef/close. And if tomorrow SearcherTracker needs to clear other things too, we're already covered.

        • About this "// nocommit – maybe make it 'public' that you just decRef?" --> do you mean whether we should jdoc that that's all we're doing? If so, why commit to just that? I don't think it contributes to the user ...
        • I have a problem with the Pruner interface. It has a single method prune which takes an IndexSearcher and ageSec (BTW, why is it double and not long?). And there's PruneByAge impl. But, what other impls could there be for this interface, if not by age?
          • On the other hand, I can certainly see someone, perhaps w/ NRT, not wanting to keep too many searchers around, and instead of committing to an age, he'll want to use a hard number (like, the newest 5 searchers) - that interface makes it impossible to impl.

        If you think however that pruning by age, is the only scenario that makes sense, then I suggest removing the interface and having just the impl. Otherwise, perhaps a different interface should be created, one that receives a list of searchers, with their age, and returns a list of searchers that should be released? Just an idea.

        Hmm, now that I read prune(Pruner) jdoc, I can see how someone could impl "newest 5 searchers" by just counting up to 5 in its doPrune() calls, because prune(Pruner) guarantees that the searchers are passed newest to oldest. But still I wonder if the interface is not too limited.

        Looks very good !

        Show
        Shai Erera added a comment - What a cool object, Mike ! And the javadocs are very good too. Perhaps instead of "recordTimeSec = System.nanoTime()/1000000000.0;" you can use TimeUnit.NANOS.toSeconds? Just for clarity, and get rid of this monstrous number . Typo: "such that if the use performs": 'use' --> 'user' About this code: + if (tracker == null ) { + tracker = new SearcherTracker(searcher); + if (searchers.putIfAbsent(version, tracker) != null ) { + // Another thread beat us -- must decRef to undo + // incRef done by SearcherTracker ctor: + searcher.getIndexReader().decRef(); + } Would it be better if SearcherTracker has a close() method and we call it instead of decRef()-ing on our own? Seems cleaner to me, and I always like to see the code that incRef/new closer to decRef/close. And if tomorrow SearcherTracker needs to clear other things too, we're already covered. About this "// nocommit – maybe make it 'public' that you just decRef?" --> do you mean whether we should jdoc that that's all we're doing? If so, why commit to just that? I don't think it contributes to the user ... I have a problem with the Pruner interface. It has a single method prune which takes an IndexSearcher and ageSec (BTW, why is it double and not long?). And there's PruneByAge impl. But, what other impls could there be for this interface, if not by age? On the other hand, I can certainly see someone, perhaps w/ NRT, not wanting to keep too many searchers around, and instead of committing to an age, he'll want to use a hard number (like, the newest 5 searchers) - that interface makes it impossible to impl. If you think however that pruning by age, is the only scenario that makes sense, then I suggest removing the interface and having just the impl. Otherwise, perhaps a different interface should be created, one that receives a list of searchers, with their age, and returns a list of searchers that should be released? Just an idea. Hmm, now that I read prune(Pruner) jdoc, I can see how someone could impl "newest 5 searchers" by just counting up to 5 in its doPrune() calls, because prune(Pruner) guarantees that the searchers are passed newest to oldest. But still I wonder if the interface is not too limited. Looks very good !
        Hide
        Michael McCandless added a comment -

        Patch.

        Show
        Michael McCandless added a comment - Patch.

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Michael McCandless
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development