Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3
    • Component/s: search
    • Labels: None

      Description

      Uses LUCENE-997 to add timeout support to Solr.

      1. SOLR-502.patch
        39 kB
        Sean Timm
      2. SOLR-502.patch
        22 kB
        Sean Timm
      3. SOLR-502.patch
        33 kB
        Sean Timm
      4. SOLR-502.patch
        33 kB
        Sean Timm
      5. SOLR-502.patch
        31 kB
        Sean Timm
      6. SOLR-502.patch
        31 kB
        Sean Timm
      7. solrTimeout.patch
        40 kB
        Sean Timm
      8. solrTimeout.patch
        37 kB
        Sean Timm
      9. solrTimeout.patch
        26 kB
        Sean Timm
      10. solrTimeout.patch
        22 kB
        Sean Timm
      11. solrTimeout.patch
        13 kB
        Sean Timm

        Issue Links

          Activity

          Sean Timm added a comment -

          This patch adds a "timeallowed" parameter which takes a timeout value in milliseconds. On a timeout, an exception is thrown from the searcher, which results in a 500 error page with the timeout exception message.

          I'd like to add support to return partial results, but I haven't done that part yet.
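          As an editor's sketch of the parameter handling this first patch describes, the snippet below reads a timeout parameter and treats a missing or non-positive value as "no timeout" (the convention documented later in this thread). The plain Map stands in for Solr's request parameters, and the helper class name is invented for illustration; note the early patches spelled the parameter "timeallowed" rather than the eventual "timeAllowed".

```java
import java.util.Map;

// Illustrative helper (not part of Solr): extract the "timeAllowed"
// request parameter, returning -1 to mean "do not time out at all".
// A missing parameter or a value <= 0 disables the timeout, matching
// the Javadoc convention adopted later in this issue.
class TimeAllowedParam {
    static long parse(Map<String, String> params) {
        String raw = params.get("timeAllowed");
        if (raw == null) {
            return -1;
        }
        long ms = Long.parseLong(raw);
        return ms > 0 ? ms : -1;
    }
}
```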

          Sean Timm added a comment -

          This patch adds a partialResults flag which is set to true in the event of a timeout. A partial set of results will be returned (including possibly no results). The flag is supported in the XML, JSON, Ruby, and Python response writers.

          A count of the number of timeouts is included in the statistics similar to the errors count.

          Caveats/ToDo: SolrJ is not aware of this setting, nor is distributed search (SOLR-303). Some execution paths may not recognize partial results (such as sorting by field) as I haven't tested those yet.

          Sean Timm added a comment -

          Better handling of timeout condition with other code paths such as sorting by a field.

          Sean Timm added a comment -

          It looks like the recent work on playing nice with external HTTP caches (SOLR-127) will need to be aware of the timeout condition. I do not think a timeout should be cached. Currently an "HTTP/1.x 304 Not Modified" is returned. I'll try to work this into my next patch update.

          Shalin Shekhar Mangar added a comment -

          An updated patch which contains changes to SolrJ to support search timeouts.

          Changes

          • SolrQuery has two new methods - setTimeAllowed and getTimeAllowed to specify timeout in milliseconds
          • SolrDocumentList has isPartialResult and setPartialResult to signal that a timeout occurred and the results returned are partial
          • XMLResponseParser#readDocuments handles the partialResults boolean attribute sent by the server
          • SolrQueryTest has a trivial test for adding/removing the timeAllowed parameter
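          The SolrJ additions listed above might be sketched roughly as follows. This is a simplified stand-in for SolrQuery, not SolrJ's actual implementation: it stores timeAllowed as an ordinary string parameter, and passing null removes it again (mirroring the trivial add/remove test mentioned in the last bullet). The backing map and parameter name are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for SolrJ's SolrQuery, showing the described
// setTimeAllowed/getTimeAllowed pair backed by a parameter map.
class QuerySketch {
    private final Map<String, String> params = new HashMap<>();

    // Specify the timeout in milliseconds; null removes the parameter.
    void setTimeAllowed(Integer ms) {
        if (ms == null) {
            params.remove("timeAllowed");
        } else {
            params.put("timeAllowed", ms.toString());
        }
    }

    // Returns the timeout, or null when no timeout has been set.
    Integer getTimeAllowed() {
        String v = params.get("timeAllowed");
        return v == null ? null : Integer.valueOf(v);
    }
}
```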
          Shalin Shekhar Mangar added a comment -

          My previous patch wasn't generated correctly. This is a corrected patch.

          Sean Timm added a comment -

          Timeouts should not be cached. This patch allows suppressing the generation of HTTP cache headers.

          Sean Timm added a comment -

          This patch includes Shalin's SolrJ patch and includes the SOLR-505 patch. HTTP cache headers are now suppressed on a timeout.

          Sean Timm added a comment -

          Added the ability to allow timeouts to occur on one or more shards in a distributed search (SOLR-303) and still have the results merged. The resulting set is reported as a partial result and is not cacheable in an HTTP cache.

          This fixes the last known issue.

          patrick o'leary added a comment -

          Has this had any traction in the Solr core yet? Seems like a critical thing to have.

          Otis Gospodnetic added a comment -

          Yes, I think we should get this in 1.3. I left the following comment in SOLR-505, but since this issue includes the patch from SOLR-505, I will assume the patch will be developed further as part of this issue and not SOLR-505.

          I took a quick look at the patch and saw this:

          rsp.setAvoidHttpCaching(false);

          Am I the only one who has a harder time reading negative methods like this, esp. when they take false?
          Would it not be nicer to just have:

          rsp.setHttpCaching(true/false);

          or even

          rsp.httpCachingOn() + rsp.httpCachingOff()

          Similarly, instead of

          isAvoidHttpCaching()

          have

          isHttpCachingOn()

          I know this is "just naming", but I think it helps with readability a bit.
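          The positively-named variant proposed above can be sketched as below. The class is a minimal illustrative stand-in, not Solr's actual SolrQueryResponse, and defaulting to "caching on" is an assumption made for the example.

```java
// Minimal sketch of the positively-named HTTP caching API suggested
// above. Illustrative stand-in only, not Solr's real SolrQueryResponse;
// the "caching on by default" initial value is an assumption.
class ResponseSketch {
    private boolean httpCaching = true;

    // Positive name reads naturally at call sites: setHttpCaching(false)
    // instead of the double negative setAvoidHttpCaching(true).
    void setHttpCaching(boolean on) {
        httpCaching = on;
    }

    boolean isHttpCachingOn() {
        return httpCaching;
    }
}
```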

          I notice the unit test mods are not in the patch. Is there no need to test the modified behaviour?

          Otis Gospodnetic added a comment -

          The patch needs to be brought up to date with trunk. I believe a person new to this issue (but in need of the functionality) will try to do that tomorrow.

          [otis@localhost trunk]$ patch -p0 -i solrTimeout.patch --dry-run
          patching file src/java/org/apache/solr/search/DocSet.java
          patching file src/java/org/apache/solr/search/DocSlice.java
          patching file src/java/org/apache/solr/search/BitDocSet.java
          patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
          patching file src/java/org/apache/solr/search/HashDocSet.java
          patching file src/java/org/apache/solr/request/SolrQueryResponse.java
          Hunk #1 succeeded at 88 (offset 9 lines).
          Hunk #2 succeeded at 159 with fuzz 2 (offset -11 lines).
          patching file src/java/org/apache/solr/request/JSONResponseWriter.java
          patching file src/java/org/apache/solr/request/XMLWriter.java
          patching file src/java/org/apache/solr/common/params/CommonParams.java
          patching file src/java/org/apache/solr/common/SolrDocumentList.java
          Hunk #2 succeeded at 65 with fuzz 2 (offset 8 lines).
          patching file src/java/org/apache/solr/handler/RequestHandlerBase.java
          Hunk #3 FAILED at 43.
          Hunk #4 succeeded at 126 (offset 7 lines).
          Hunk #5 succeeded at 168 with fuzz 2.
          1 out of 5 hunks FAILED -- saving rejects to file src/java/org/apache/solr/handler/RequestHandlerBase.java.rej
          patching file src/java/org/apache/solr/handler/component/SearchHandler.java
          Hunk #1 succeeded at 118 (offset -5 lines).
          Hunk #2 succeeded at 259 (offset 6 lines).
          patching file src/java/org/apache/solr/handler/component/QueryComponent.java
          patching file src/java/org/apache/solr/handler/SpellCheckerRequestHandler.java
          patching file src/java/org/apache/solr/handler/MoreLikeThisHandler.java
          patching file src/webapp/src/org/apache/solr/servlet/cache/Method.java
          patching file src/webapp/src/org/apache/solr/servlet/cache/HttpCacheHeaderUtil.java
          patching file src/webapp/src/org/apache/solr/servlet/SolrDispatchFilter.java
          Hunk #1 FAILED at 263.
          Hunk #2 FAILED at 282.
          2 out of 2 hunks FAILED -- saving rejects to file src/webapp/src/org/apache/solr/servlet/SolrDispatchFilter.java.rej
          patching file client/java/solrj/test/org/apache/solr/client/solrj/SolrQueryTest.java
          patching file client/java/solrj/src/org/apache/solr/client/solrj/impl/XMLResponseParser.java
          Hunk #1 succeeded at 344 (offset 1 line).
          patching file client/java/solrj/src/org/apache/solr/client/solrj/SolrQuery.java

          Otis Gospodnetic added a comment -

          Sean, do you think you can remove the changes that are a part of SOLR-505 from your patch, as mentioned here: https://issues.apache.org/jira/browse/SOLR-505?focusedCommentId=12598951#action_12598951

          Thanks.

          Sean Timm added a comment -

          Otis, I'd be happy to do so. Is there a way to generate a patch excluding the content of another patch without manually editing the patch file, which would be error prone? Or should I wait until SOLR-505 is committed?

          Otis Gospodnetic added a comment -

          I think it's a manual thing, but super simple in this case. I'll commit SOLR-505 as soon as Thomas does the renaming, so if you want you can wait, svn up, see conflicts, and then manually remove conflicts and re-generate the patch.

          Yonik Seeley added a comment -

          whew... that's a lot of changes for timeouts (flag on DocSet, DocList, ResponseWriter changes, etc)

          It also seems like we shouldn't add any more conditionals to the inner loop HitCollector.collect().
          If it's a timed hit collector, perhaps just wrap the original hit collector so non-timed collectors don't take a speed hit.
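          The wrapping approach suggested here, keeping the base collector free of timeout checks and applying a timed wrapper only when a timeout was requested, might look roughly like this. The Collector interface, the exception, and the injected clock are minimal stand-ins for the Lucene classes, not the real API; the clock is injected so behavior is deterministic rather than wall-clock dependent.

```java
import java.util.function.LongSupplier;

// Minimal stand-in for Lucene's HitCollector (not the real interface).
interface Collector {
    void collect(int doc, float score);
}

// Thrown when collection runs past the allowed time (illustrative).
class TimeExceededException extends RuntimeException {
    TimeExceededException(long allowedMs, long elapsedMs) {
        super("Exceeded allowed search time: " + allowedMs
                + " ms (elapsed: " + elapsedMs + " ms)");
    }
}

// Decorator that adds the timeout check; untimed searches use the
// plain collector directly and pay no per-hit cost.
class TimeLimitedCollectorSketch implements Collector {
    private final Collector delegate;
    private final long timeAllowedMs;
    private final LongSupplier clock; // injected time source, in ms
    private final long startMs;

    TimeLimitedCollectorSketch(Collector delegate, long timeAllowedMs, LongSupplier clock) {
        this.delegate = delegate;
        this.timeAllowedMs = timeAllowedMs;
        this.clock = clock;
        this.startMs = clock.getAsLong();
    }

    @Override
    public void collect(int doc, float score) {
        long elapsed = clock.getAsLong() - startMs;
        if (elapsed > timeAllowedMs) {
            throw new TimeExceededException(timeAllowedMs, elapsed);
        }
        delegate.collect(doc, score);
    }
}
```

          A search would then wrap its collector only when timeAllowed > 0; everything collected before the deadline is still held by the delegate, which is what makes returning partial results possible.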

          Sean Timm added a comment -

          Patch brought up to date with trunk. This patch no longer includes SOLR-505, but is dependent on it.

          Though I doubt the conditional check, even in the tight loop, has any performance impact, I was able to remove it while also improving the code readability.

          Otis Gospodnetic added a comment -
          • Does timeallowed=-1 mean "do not time out at all"? Is that mentioned anywhere? I also see a check for timeallowed > 0, so 0 also seems to mean "do not time out at all".
          • CamelCase: timeallowed => timeAllowed?
          • I see "This should only be called using either filterList or filter, but not both.", but I don't see a check for that. Should there be a test for the two vars?
          • I see System.out.println( "partialResults0: " + partialResults );

          The rest, from what I can tell, looks good.

          P.S.
          SOLR-502-solrj.patch is just an old patch that we can really remove so it doesn't confuse anyone, correct?

          Sean Timm added a comment - - edited
          • Added Javadoc note that a timeallowed param <=0 (or omitted) results in no timeout.
          • Fixed the "CamelCase: timeallowed => timeAllowed"
          • Removed the System.out.println(...) statements.

          I see "This should only be called using either filterList or filter, but not both.", but I don't see a check for that. Should there be a test for the two vars?

          This comment was copied from the existing getDocListC method (without the timeAllowed parameter). If there should be a sanity check there, it should probably be added as a separate JIRA issue.

          Sean Timm added a comment -

          SOLR-502-solrj.patch is just an old patch that we can really remove so it doesn't confuse anyone, correct?

          Yes, this is an old patch which can be removed. The solrTimeout.patch files could be removed as well if they are found to be confusing.

          Shalin Shekhar Mangar added a comment -

          I've removed the SOLR-502-solrj.patch as per the above comments.

          Otis Gospodnetic added a comment -

          I finally had the chance to apply this locally and try it out. I have not been able to get this time out business to kick in, though. Here is what I did so far, after applying the patch, and clean dist and deployment of solr war.

          I set up 2 Solr instances (actually 1 Jetty with 2 Solr homes defined via JNDI). Identical schema, each index with 100K docs.

          I then hit one of the instances, specifying both shards and asked for q=title:a* (expensive query), while using timeAllowed=1, like this:

          curl --silent 'http://localhost:8080/solr1/select/?q=title%3Aa*&version=2.2&start=0&rows=1000&indent=on&shards=localhost:8080/solr1,localhost:8080/solr2&timeAllowed=1' | less
          

          ....Aaaarg, I see one problem. That "timeAllowed" is specified as "timeallowed":

          [otis@localhost SOLR-502]$ grep TIME_ALLOW SOLR-502.patch  | head -1
          +  public static final String TIME_ALLOWED = "timeallowed";
          

          Sean, I think this should be camelCase, too.

          OK, so changing that:

          curl --silent 'http://localhost:8080/solr1/select/?q=title%3Aa*&version=2.2&start=0&rows=1000&indent=on&shards=localhost:8080/solr1,localhost:8080/solr2&timeallowed=1' | less
          

          However, I am still unable to get the timeout to happen. I see QTime of 257 in the response, clearly above timeallowed=1. If timeallowed=1, should I ever be seeing QTime over 1?

          <lst name="responseHeader">
           <int name="status">0</int>
           <int name="QTime">257</int>
           <lst name="params">
            <str name="shards">localhost:8080/solr1,localhost:8080/solr2</str>
            <str name="indent">on</str>
            <str name="start">0</str>
            <str name="q">title:a*</str>
            <str name="timeallowed">1</str>
            <str name="version">2.2</str>
            <str name="rows">10</str>
           </lst>
          </lst>
          <result name="response" numFound="50936" start="0">
          

          I also grepped the output for "partial" and never found anything. Am I doing something wrong?
          I also see the latest SOLR-502.patch still has some print statements, so I looked at stdout, but nothing is getting printed there.

          I'll see if I can trace this, but if I did something wrong or see a bug in your code, I'm all eyes.

          Otis Gospodnetic added a comment - - edited

          Just in case it helps:

          • I used solrconfig.xml from example/solr/conf
          • I set all cache sizes to 0

          Also, after adding some lovely print statements to the patch, right in the QueryComponent's process method, it looks like my query requests are not even executing QueryComponent's process method. Is there something I need to enable to get QueryComponent included? Standard request handler should be using it already, no?

          Sean Timm added a comment -
          • Adds partialResults support to the binary response, which is used by distributed search.
          • Really removes the System.out.println() this time.
          • timeallowed param is now camelCase (timeAllowed).
          Sean Timm added a comment -

          Sorry about the timeallowed parameter. For some reason I had in my head that the parameters were not supposed to be camel case and I only switched the parameter variable names.

          You should be seeing a log message similar to:

          WARNING: Query: title:s*; Elapsed time: 20Exceeded allowed search time: 1 ms.
          

          even with the previous patch. Though, when using distributed search, the new binary response format is used, which I hadn't modified to include partial results support. It should work with this new patch.

          <lst name="responseHeader">
           <int name="status">0</int>
           <int name="QTime">39</int>
           <lst name="params">
            <str name="shards">naan.office.aol.com:8973/solr,naan.office.aol.com:8993/solr</str>
            <str name="indent">on</str>
          
            <str name="start">0</str>
            <str name="q">headline:s*</str>
            <str name="timeAllowed">1</str>
            <str name="version">2.2</str>
            <str name="rows">100</str>
           </lst>
          
          </lst>
          <result name="response" numFound="0" start="0" partialResults="true"/>
          

          If timeallowed=1, should I ever be seeing QTime over 1?

          Yes, the TimeLimitedCollector can only interrupt searches during the collect() calls. Other, sometimes substantial, work is done outside of the collect().

          Also, see the note in the TimeLimitedCollector.setResolution(long) Javadocs http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/search/TimeLimitedCollector.html#setResolution(long)

          Noble Paul added a comment -

          Sean: for the NamedListCodec changes to be backward compatible (within 1.3), add a check of the list size before calling list.get():

          if(list.size() > 3)  solrDocs.setPartialResult((Boolean)list.get(3));
          
          Sean Timm added a comment -

          This patch adds a conditional to ensure backwards compatibility within SOLR 1.3 nightly builds, per Noble Paul's suggestion. Is that necessary?

          Otis Gospodnetic added a comment -

          Just a quick note that the patch now does produce partial results.

          I see two "problems" with time-limiting search, but they are mostly general and not all are directly related to this patch:

          1. lots of work can be done outside collect, so the QTime can be radically higher than timeAllowed (e.g. I have timeAllowed=50, but I'm seeing QTime of over 2000)
          2. the number of hits will vary for the same timeAllowed and the same query. This may not be good for apps that want to show the exact number of hits in the UI.

          Still, I think having the option of timing out long searches is a good thing.
          I'm +1 on committing this.

          Otis Gospodnetic added a comment -

          Yonik, I think Sean addressed the 2 issues from your May 22 comments. Here is svn st:

          M src/java/org/apache/solr/search/DocSet.java
          M src/java/org/apache/solr/search/DocSlice.java
          M src/java/org/apache/solr/search/BitDocSet.java
          M src/java/org/apache/solr/search/SolrIndexSearcher.java
          M src/java/org/apache/solr/search/HashDocSet.java
          M src/java/org/apache/solr/common/params/CommonParams.java
          M src/java/org/apache/solr/common/SolrDocumentList.java
          M src/java/org/apache/solr/common/util/NamedListCodec.java
          M src/java/org/apache/solr/request/BinaryResponseWriter.java
          M src/java/org/apache/solr/request/JSONResponseWriter.java
          M src/java/org/apache/solr/request/XMLWriter.java
          M src/java/org/apache/solr/handler/RequestHandlerBase.java
          M src/java/org/apache/solr/handler/component/QueryComponent.java
          M client/java/solrj/test/org/apache/solr/client/solrj/SolrQueryTest.java
          M client/java/solrj/src/org/apache/solr/client/solrj/impl/XMLResponseParser.java
          M client/java/solrj/src/org/apache/solr/client/solrj/SolrQuery.java

          I'll commit next week if nobody objects.

          Yonik Seeley added a comment -

          From SOLR-303:

          Perhaps a string should also be added indicating why all results were not able to be returned.

          If we had that (perhaps in the response header) would there still be a need to have this partial results flag on DocSet/DocList? It always felt a little wrong that this feature wormed its way down to that low a level (DocSet, response writers, response parsers, etc.). Seems like it could/should be much simpler.

          Sean Timm added a comment -

          Yonik--

          Do you have a suggestion on how to get it into the response header? That isn't available down at the SolrIndexSearcher level as far as I can tell. It would be much easier if the ResponseBuilder or some other object was passed all the way down to the searcher level, but I was trying to make the smallest change possible.

          I think an easy machine readable value to indicate partial results is important. I think a descriptive string is optional, but would be a nice addition.

          -Sean

          Yonik Seeley added a comment -

          Do you have a suggestion on how to get it into the response header? That isn't available down at the SolrIndexSearcher level as far as I can tell.

          Off the top of my head, it seems like it might be cleaner to throw an exception in the SolrIndexSearcher method doing the searching (that would have the added benefit of automatically bypassing DocSet/DocList caching, etc).

          Catch that exception in the query component and set a flag in the header indicating that a timeout happened.

          Or if it's easier, pass more info down to the SolrIndexSearcher.

          After all, this only handles timeouts at the query level (not query expansion/rewriting, highlighting, retrieving stored fields, faceting, or any other number of components that will be added in the future). It also doesn't even totally handle timeouts at the query level... one could construct a query that takes a lot of time yet matches no documents so there is never an opportunity to time out. Then there is the issue of false positives (a major GC compaction hits and causes all the currently running queries to time out). Given these restrictions, and the fact that most people would choose not to get partial results, it seems like we should really try to limit the impact/invasiveness of this feature.
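
          A rough sketch of that flow follows; the class and method names are stand-ins, not Solr's actual API. The searcher throws on timeout, the component catches it and records a flag in the response header, and because the search method never returns normally, the half-built DocSet/DocList is never inserted into the caches.

          ```java
          // Sketch of the suggested flow; names are stand-ins, not Solr's actual API.
          import java.util.LinkedHashMap;
          import java.util.Map;

          class TimeExceeded extends RuntimeException {}

          class SearcherSketch {
              static void search(boolean tooSlow) {
                  if (tooSlow) throw new TimeExceeded(); // bypasses result caching
                  // ... normal search and cache insertion would happen here ...
              }
          }

          class QueryComponentSketch {
              static Map<String, Object> process(boolean tooSlow) {
                  Map<String, Object> responseHeader = new LinkedHashMap<String, Object>();
                  try {
                      SearcherSketch.search(tooSlow);
                  } catch (TimeExceeded e) {
                      // the component, not the searcher, touches the header
                      responseHeader.put("partialResults", true);
                  }
                  return responseHeader;
              }
          }
          ```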

          Yonik Seeley added a comment -

          one could construct a query that takes a lot of time yet matches no documents...

          Actually, thinking a little further on this point, some of the longest queries are range queries or prefix queries. Solr uses the constant scoring variety of these, where all of the matching documents are collected up front. So these queries would never be interrupted in the middle, but only at the end after the majority of the work had been done.

          Sean Timm added a comment -

          I've been thinking about putting the timeout info in the response header. Throwing an exception from the searcher will not work because that sacrifices the ability to get partial results. I really feel that having the partialResults flag as an attribute on the response tag makes more sense than putting it in the response header, as the partial results pertain to the results section of the response. I will create an alternate patch, however, with the partial results flag in the response header to compare the two methods.

          Yonik Seeley added a comment -

          I'm trying to think of Solr installations I've seen with problems, where this would be the solution I would recommend... but I can't say I can think of any (hence my hesitation, wondering if this is more of a solution in search of a problem).

          What is the underlying problem with some queries taking longer than others?
          Is it more for the server side (don't spend too much time on any 1 query), or for
          the client side (control latency, even if result set is partial).

          I have seen problems where queries started stacking up:
          http://www.nabble.com/TermInfosReader-lazy-term-index-reading-to8772766.html#a8775141
          But this would not have fixed the problem.

          If timeouts and partial results are expected to happen regularly (a non-trivial % of queries), then one needs to start worrying about the bias introduced (it's always low doc numbers that will be returned, which will be older documents if one is doing incremental indexing). It seems like a better solution is to increase hardware dedicated to the search collection.

          If timeouts and partial results are expected to be rare, then it should have little impact on the overall server load, so why worry about them?

          Thoughts?

          Ian Holsman added a comment -

          Hi Yonik.

          The scenario I always come up with is when a developer launches something into production without properly testing it out on a large size index and fluffs the query.

          Without a timeout/partial result he will bring the site down quite quickly and it will stay down until the operations guys roll it out again.

          Yonik Seeley added a comment -

          The scenario I always come up with is when a developer launches something into production without properly testing it out on a large size index and fluffs the query.

          Heh... yup I remember seeing a couple of those.
          Unfortunately the ones I remember wouldn't have been saved by this patch because the "bad" part of the query was an expensive range query (or once a prefix query) that wasn't pulled out into a separate "fq".

          Sean Timm added a comment -

          Changes the setting of the partialResults flag from the results to the responseHeader.

          Yonik Seeley added a comment -

          Thanks Sean, that definitely cuts down the patch size, and it seems nicer not to be touching DocSet and ResponseWriters, etc. What's your take?

          Another thing to consider is perhaps a SolrIndexSearcher.search() method that uses a command pattern to avoid having to always change the signatures when we want to pass something new in or out? It might be more natural than passing down an un-typed NamedList:

          QueryCommand {
            Query q
            Sort s
            List<Query> filters
            DocSet filter
            int flags
            int timeout
            ...
          }
          
          QueryResult {
            DocList list
            DocSet set
            boolean timedOut
            ...
          }
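
          The sketch above, rendered as a compilable fragment; the fluent setters and the exact field set are assumptions, not the committed API. The benefit of the pattern: a new input or output becomes a new field instead of another search() overload at every call site.

          ```java
          // Compilable rendering of the QueryCommand/QueryResult sketch; fluent
          // setters and field choices are illustrative assumptions.
          class QueryCommand {
              String q;
              int flags;
              long timeAllowed;
              QueryCommand setQuery(String q) { this.q = q; return this; }
              QueryCommand setFlags(int flags) { this.flags = flags; return this; }
              QueryCommand setTimeAllowed(long ms) { this.timeAllowed = ms; return this; }
          }

          class QueryResult {
              boolean partialResults; // reported back without widening the signature
          }
          ```

          A caller would then build new QueryCommand().setQuery(...).setTimeAllowed(50) and read partialResults off the returned QueryResult.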
          
          Sean Timm added a comment -

          The timeout is to protect the server side. The client side can be largely protected by setting a read timeout, but if the client aborts before the server responds, the server is just wasting resources processing a request that will never be used. The partial results is useful in a couple of scenarios, probably the most important is a large distributed complex where you would rather get whatever results you can from a slow shard rather than throw them away.

          As a real world example, the query "contact us about our site" on a 2.3MM document index (partial Dmoz crawl) takes several seconds to complete, while the mean response time is sub 50 ms. We've had cases where a bot walks the next page links (including expensive queries such as this). Also users are prone to repeatedly click the query button if they get impatient on a slow site. Without a server side timeout, this is a real issue.

          Rate limiting and limiting the number of next pages that can be fetched at the front end are also part of the solution to the above example.

          Sean Timm added a comment -

          that definitely cuts down the patch size [...] What's your take?

          Before I made the change, I was against it as it seems more logical to have the partialResults associated with the results list, where the total count, etc. are. But this greatly simplifies the patch. I could go either way.

          Another thing to consider is perhaps a SolrIndexSearcher.search() method that uses a command pattern

          I think I agree. How is this different from my suggestion of passing the ResponseBuilder into the searcher? It seems that it would be useful to take it even a step further and pass the ResponseBuilder object around everywhere, including the response handlers and writers.

          Yonik Seeley added a comment -

          I see your point on the overlap between something like QueryCommand and ResponseBuilder... but ResponseBuilder feels like it's at a higher level. Say that a custom component or handler wants to do a couple queries... or different queries depending on the results of the first query (or something like unsupervised feedback). Should a different ResponseBuilder object be built for each query that is part of a request/response?

          ResponseBuilder is also a bit big and ill-defined (but currently gets the job done for communication between different query components). Upgrading it to serve as something you pass to a SolrIndexSearcher to do a query doesn't feel quite right (or at least that's not the way I've been thinking about it).

          Sean Timm added a comment -

          Are you thinking the command pattern would be the preferred way of doing a SolrIndexSearcher.search(), possibly even deprecating the existing methods? I think that makes sense, but seems to be a big change to add to this patch. I think I'd prefer to see it fleshed out in a separate issue. The methods returning Hits should probably be deprecated in Solr 1.3 anyway since Hits is going away in Lucene 3.0.

          Yonik Seeley added a comment -

          I think that makes sense, but seems to be a big change to add to this patch.

          The patch already changes (or adds to) the API. So instead of passing down extra parameters (timeout and NamedList), pass down an object that encapsulates all the parameters. Deprecations can wait for another day.

          The primary motivation is that it seems messy passing down the un-typed NamedList<Object> and having SolrIndexSearcher set things in the header (rather than the QueryComponent do it).

          Sean Timm added a comment -

          Added a SolrIndexSearcher.search() method that uses a command pattern.

          Otis Gospodnetic added a comment -

          I'm a bit out of touch with this one now (vacationnnnnnn), but should this patch contain solrj changes as well?
          (a new method for max allowed time?)
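
          The later discussion confirms the patch does add SolrQuery.setTimeAllowed(Integer). A stand-in sketch of what that method does (SolrQueryMock here mocks org.apache.solr.client.solrj.SolrQuery; the internals shown are an assumption, though the timeAllowed parameter name appears in this issue's logs):

          ```java
          // Mock of the one solrj method under discussion, not the real class.
          // setTimeAllowed(Integer) sets the timeAllowed request parameter, in
          // milliseconds; null is assumed to clear it.
          import java.util.HashMap;
          import java.util.Map;

          class SolrQueryMock {
              final Map<String, String> params = new HashMap<String, String>();

              SolrQueryMock setTimeAllowed(Integer ms) {
                  if (ms == null) {
                      params.remove("timeAllowed");
                  } else {
                      params.put("timeAllowed", ms.toString());
                  }
                  return this;
              }
          }
          ```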

          Yonik Seeley added a comment -

          That was a bigger change set than I had anticipated - I thought perhaps of just introducing the QueryCommand and passing an extra param to the non-public method(s) to minimize changes. Things look good though, and I just committed this patch.
          Thanks Sean!

          Sean Timm added a comment -

          Yonik--

          I just noticed that the Javadoc specifies "Long", but the setTimeAllowed function takes an Integer in org.apache.solr.client.solrj.SolrQuery.

          Thanks,
          Sean

          Yonik Seeley added a comment -

          Hmmm, so should we change timeAllowed to an int everywhere (1.6 years worth of timeout) or add getLong() methods to SolrParams?

          Sean Timm added a comment -

          I think that the code is fine as is. Just the Javadoc comment needs to be changed. The Integer is explicitly cast to a long when it is used. And as you note, 1.6 years is plenty long enough.

          Magne Groenhuis added a comment -

          I am using the trunk version of Solr and split a Lucene index into two 2 GB shards, hosting them on two different ports.
          Even though I know work is being done outside the timeAllowed period, I get some extreme numbers, giving the user the impression that timeAllowed is not actually doing anything.

          INFO: webapp=/solr path=/select params=

          {fl=bla1&shards=localhost:8983/solr,localhost:7574/solr&start=65850&q=bla2&timeAllowed=1000&wt=javabin&rows=5&version=2.2}

          status=0 QTime=331390

          I am browsing to the end of some search result. I assume that under the hood an extreme number of ids with scores have to be sent to the merge machine, but I was hoping that the timeAllowed parameter would limit the amount of time searched on the shards and thus limit the time for the client, possibly resulting in no results (because of browsing so far into a large search result).

          But still, the numbers 1000 and 331390 are a bit far apart.

          Any suggestions, or do I need to provide more data?

          Otis Gospodnetic added a comment -

          Magne,
          I think the problem is that start=65850. You have rows=5, so I think result merging is not the problem.
          I'm not sure what exactly happens outside of the collect call (the part that can be time-limited), but that's clearly costing you time.
          Going deep in search results is a problem for all search engines, as far as I know. Try going beyond 1000 matches in Google.
          If you are OK with not returning any results to the client, then why not put the timeout around that client's call to Solr?
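
          A back-of-the-envelope illustration of why start=65850 dominates here (this formula is a simplification of what the distributed merge actually transfers):

          ```java
          // For 'rows' results at offset 'start', each shard must score and rank
          // its own top (start + rows) documents, and the merge node receives
          // that many candidate ids+scores from every shard.
          class DeepPagingCost {
              static int mergedCandidates(int start, int rows, int shards) {
                  return (start + rows) * shards;
              }
          }
          ```

          With start=65850, rows=5, and two shards, that is 65855 ranked entries per shard and 131710 candidates at the merger, for five displayed results.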


            People

            • Assignee:
              Yonik Seeley
              Reporter:
              Sean Timm
            • Votes:
              3
              Watchers:
              2
