Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      Patch Available

      Description

      This patch is based on Nutch-308.

      This patch adds support for a maximum search time limit. After this time is exceeded, the search thread is stopped, partial results (if any) are returned and the total number of results is estimated.

      This patch tries to minimize the overhead related to time-keeping by using a version of safe unsynchronized timer.

      This was also discussed in an e-mail thread.
      http://www.nabble.com/search-timeout-tf3410206.html#a9501029

      1. HitCollectorTimeoutDecorator.java
        1 kB
        Timo Nentwig
      2. LuceneTimeoutTest.java
        2 kB
        Sean Timm
      3. LuceneTimeoutTest.java
        2 kB
        Sean Timm
      4. MyHitCollector.java
        0.3 kB
        Timo Nentwig
      5. timeout.patch
        16 kB
        Doron Cohen
      6. timeout.patch
        16 kB
        Doron Cohen
      7. timeout.patch
        16 kB
        Doron Cohen
      8. timeout.patch
        10 kB
        Sean Timm
      9. timeout.patch
        4 kB
        Sean Timm
      10. timeout.patch
        5 kB
        Sean Timm
      11. timeout.patch
        15 kB
        Sean Timm
      12. timeout.patch
        14 kB
        Sean Timm
      13. TimerThreadTest.java
        5 kB
        Sean Timm

        Issue Links

          Activity

          Hide
          timmsc Sean Timm added a comment -

          Patch against trunk revision 575451.

          Show
          timmsc Sean Timm added a comment - Patch against trunk revision 575451.
          Hide
          timmsc Sean Timm added a comment -

          Simple test case. Run by passing in the index directory as an argument.

          Show
          timmsc Sean Timm added a comment - Simple test case. Run by passing in the index directory as an argument.
          Hide
          timmsc Sean Timm added a comment -

          Here are some additional details on the changes.

          New files:
          TimeLimitedCollector.java

          Extends HitCollector and detects timeouts resulting in a TimeLimitedCollector.TimeExceeded exception being thrown.

          TimerThread.java

          TimerThread provides a pseudo-clock service to all searching threads, so that they can count elapsed time with less overhead than repeatedly calling System.currentTimeMillis. A single thread should be created to be used for all searches.

          Modified Files:
          Hits.java

          Added partial result flag.

          IndexSearcher.java

          Catches TimeLimitedCollector.TimeExceeded, sets partial results flag on TopDocs and estimates the total hit count (if we hadn't timed out partway through). Returns TopDocs with partial results.

          Searcher.java

          Added methods to set and get the timeout parameters. This implementation decision has the limitation of only permitting a single timeout value per Searcher instance (of which there is usually only one). However, this greatly minimizes the number of search methods that would need to be added. In practice, I have not needed the functionality to change the timeout settings on a per query basis.

          TopFieldDocCollector.java

          Uses TimeLimitedCollector functionality.

          TopDocCollector.java

          Uses TimeLimitedCollector functionality and exposes it to child class TopFieldDocCollector.

          TopDocs.java

          Added partial results flag. Note, TopFieldDocs extends this class and inherits the new functionality.

          Show
          timmsc Sean Timm added a comment - Here are some additional details on the changes. New files: TimeLimitedCollector.java Extends HitCollector and detects timeouts resulting in a TimeLimitedCollector.TimeExceeded exception being thrown. TimerThread.java TimerThread provides a pseudo-clock service to all searching threads, so that they can count elapsed time with less overhead than repeatedly calling System.currentTimeMillis. A single thread should be created to be used for all searches. Modified Files: Hits.java Added partial result flag. IndexSearcher.java Catches TimeLimitedCollector.TimeExceeded, sets partial results flag on TopDocs and estimates the total hit count (if we hadn't timed out partway through). Returns TopDocs with partial results. Searcher.java Added methods to set and get the timeout parameters. This implementation decision has the limitation of only permitting a single timeout value per Searcher instance (of which there is usually only one). However, this greatly minimizes the number of search methods that would need to be added. In practice, I have not needed the functionality to change the timeout settings on a per query basis. TopFieldDocCollector.java Uses TimeLimitedCollector functionality. TopDocCollector.java Uses TimeLimitedCollector functionality and exposes it to child class TopFieldDocCollector. TopDocs.java Added partial results flag. Note, TopFieldDocs extends this class and inherits the new functionality.
          Hide
          lucenebugs@danielnaber.de Daniel Naber added a comment -

          Thanks for the patch. I didn't have a very close look, just one small thing: it's probably no good idea to catch and ignore the InterruptedException. See http://www-128.ibm.com/developerworks/java/library/j-jtp05236.html

          Show
          lucenebugs@danielnaber.de Daniel Naber added a comment - Thanks for the patch. I didn't have a very close look, just one small thing: it's probably no good idea to catch and ignore the InterruptedException. See http://www-128.ibm.com/developerworks/java/library/j-jtp05236.html
          Hide
          hossman Hoss Man added a comment -

          I'm not entirely convinced it makes sense to modify these classes to include timeouts as core functionality ... would it make more sense to deal with this in subclasses of IndexSearcher/TopDocs/Hits ?

          Show
          hossman Hoss Man added a comment - I'm not entirely convinced it makes sense to modify these classes to include timeouts as core functionality ... would it make more sense to deal with this in subclasses of IndexSearcher/TopDocs/Hits ?
          Hide
          timmsc Sean Timm added a comment -

          http://www-128.ibm.com/developerworks/java/library/j-jtp05236.html

          TimerThread Now follows Brian Goetz's best practice for a noncancelable task that restores interrupted status before returning rather than ignoring the InterruptedException.

          Show
          timmsc Sean Timm added a comment - http://www-128.ibm.com/developerworks/java/library/j-jtp05236.html TimerThread Now follows Brian Goetz's best practice for a noncancelable task that restores interrupted status before returning rather than ignoring the InterruptedException.
          Hide
          timmsc Sean Timm added a comment -

          Two issues are addressed in this latest patch:

          1) Timeout support was not added to: public TopFieldDocs search(Weight weight, Filter filter, final int nDocs, Sort sort)

          2) getCounter() in TimerThread was replaced by getMilliseconds()

          Show
          timmsc Sean Timm added a comment - Two issues are addressed in this latest patch: 1) Timeout support was not added to: public TopFieldDocs search(Weight weight, Filter filter, final int nDocs, Sort sort) 2) getCounter() in TimerThread was replaced by getMilliseconds()
          Hide
          doronc Doron Cohen added a comment -

          I think this is a nice feature to have.

          But I am not sure about the propsed API - the application creates a TimerThread, starts it, and the timer thread is then passed to the searcher with setTimeOut(timer,ticks). Not so simple.

          I think my preference for the API and implementation would be in HitCllector.collect() - in other words, we consider this new feature as an advanced one, and so only allow applications to provide their "timed" hitCollector. The modified collect() would either throw a TimeoutException or return a timedOut indication. If this is a (subclass of) RunTimeException (thuogh I am not crazy about this alternative) then there's no API change (a plus) but we need to verify that the code below propagates the RuntimeException gracefully and closes all the streams and everything (which I believe it does with all last careful changes by Mike and Michael). If RuntimeEXception is not acceptable, then this is an API change (a minus) and also many (simple) changes will be required in scorers (callers to collect).

          The application's timedColletor will have all the logic in that collector for both announcing and detecting the timeout. Next we can add a TimedCollector for the benefit of applications, and last, consider adding search() methods with timeOut, but I doubt that last step.

          Show
          doronc Doron Cohen added a comment - I think this is a nice feature to have. But I am not sure about the propsed API - the application creates a TimerThread, starts it, and the timer thread is then passed to the searcher with setTimeOut(timer,ticks). Not so simple. I think my preference for the API and implementation would be in HitCllector.collect() - in other words, we consider this new feature as an advanced one, and so only allow applications to provide their "timed" hitCollector. The modified collect() would either throw a TimeoutException or return a timedOut indication. If this is a (subclass of) RunTimeException (thuogh I am not crazy about this alternative) then there's no API change (a plus) but we need to verify that the code below propagates the RuntimeException gracefully and closes all the streams and everything (which I believe it does with all last careful changes by Mike and Michael). If RuntimeEXception is not acceptable, then this is an API change (a minus) and also many (simple) changes will be required in scorers (callers to collect). The application's timedColletor will have all the logic in that collector for both announcing and detecting the timeout. Next we can add a TimedCollector for the benefit of applications, and last, consider adding search() methods with timeOut, but I doubt that last step.
          Hide
          yseeley@gmail.com Yonik Seeley added a comment -

          > allow applications to provide their "timed" hitCollector

          +1

          Show
          yseeley@gmail.com Yonik Seeley added a comment - > allow applications to provide their "timed" hitCollector +1
          Hide
          goksron Lance Norskog added a comment -

          I just requested a more fancy feature in the Solr Jira. My apologies, I did not think to search the Lucene Jira.

          1) timeout: stop searching after N milliseconds and return results using only those hits already found
          2) hit limit: stop searching after N milliseconds and return results using only those hits already found
          3) ram limit: estimate the amount of ram used so far and stop searching at a given amount

          Here is the complete request:

          Show
          goksron Lance Norskog added a comment - I just requested a more fancy feature in the Solr Jira. My apologies, I did not think to search the Lucene Jira. 1) timeout: stop searching after N milliseconds and return results using only those hits already found 2) hit limit: stop searching after N milliseconds and return results using only those hits already found 3) ram limit: estimate the amount of ram used so far and stop searching at a given amount Here is the complete request:
          Hide
          goksron Lance Norskog added a comment -

          I stumbled above; I do not yet know Jira The Solr code is SOLR-392.

          This request is inspired by a public search engine with millions of records.
          There are three different aspects mentioned above that can cause a query to "go rogue": timing out, finding too many records to give a truly useable result, and using up too much memory. The point is that if a search is going to find 14 million hits, Google does not go and tally them. It stops quickly and estimates how many might remain. I would like to have similar control.

          The HitCollector implementation mentioned above would allow all three of these control options. If they could be pipelined together we could use any or all of them.

          Show
          goksron Lance Norskog added a comment - I stumbled above; I do not yet know Jira The Solr code is SOLR-392 . This request is inspired by a public search engine with millions of records. There are three different aspects mentioned above that can cause a query to "go rogue": timing out, finding too many records to give a truly useable result, and using up too much memory. The point is that if a search is going to find 14 million hits, Google does not go and tally them. It stops quickly and estimates how many might remain. I would like to have similar control. The HitCollector implementation mentioned above would allow all three of these control options. If they could be pipelined together we could use any or all of them.
          Hide
          timmsc Sean Timm added a comment -

          > I think my preference for the API and implementation would be in HitCollector.collect()

          This would be simpler, but I don't see how it would be possible to estimate the total number of results and return partial results in that case. I think that is an important feature.

          If the concern is complexity for the application, perhaps it is possible to hide the TimerThread altogether. The TimerThread could be created and started via a searcher setTimeOut(tick, numTicks) method.

          To simplify it further, ticks could be fixed at a reasonable number, e.g., 100 ms, and a timeout in milliseconds could be passed in: setTimeout(milliseconds).

          Show
          timmsc Sean Timm added a comment - > I think my preference for the API and implementation would be in HitCollector.collect() This would be simpler, but I don't see how it would be possible to estimate the total number of results and return partial results in that case. I think that is an important feature. If the concern is complexity for the application, perhaps it is possible to hide the TimerThread altogether. The TimerThread could be created and started via a searcher setTimeOut(tick, numTicks) method. To simplify it further, ticks could be fixed at a reasonable number, e.g., 100 ms, and a timeout in milliseconds could be passed in: setTimeout(milliseconds).
          Hide
          doronc Doron Cohen added a comment -

          TimerThread provides a pseudo-clock service to all searching threads,
          so that they can count elapsed time with less overhead than repeatedly
          calling System.currentTimeMillis. A single thread should be created to
          be used for all searches.

          Is this really faster than calling System.currentTimeMillis()?
          I quick searched but found no references supporting this.
          This one says the opposite:
          http://www.devx.com/Java/Article/28685
          Because if this is not the case, you could do without the TimerThread?

          Show
          doronc Doron Cohen added a comment - TimerThread provides a pseudo-clock service to all searching threads, so that they can count elapsed time with less overhead than repeatedly calling System.currentTimeMillis. A single thread should be created to be used for all searches. Is this really faster than calling System.currentTimeMillis()? I quick searched but found no references supporting this. This one says the opposite: http://www.devx.com/Java/Article/28685 Because if this is not the case, you could do without the TimerThread?
          Hide
          mikemccand Michael McCandless added a comment -

          I think one benefit of a dedicated timer thread is not being affected
          by clock shift on the machine. System.currentTimeMillis() is not
          guaranteed to be monotonic. System.nanoTime() (1.5 only!) tries
          to be (I think), but it's still not guaranteed.

          Without a monotonic clock, if the clock shifts forward then it could
          timeout in-flight queries (way) too early.

          But: what happens when TimerThread overflows the int (a
          2*1024*1024*1024)? Is it the caller's job to deal with the
          wraparound?

          Show
          mikemccand Michael McCandless added a comment - I think one benefit of a dedicated timer thread is not being affected by clock shift on the machine. System.currentTimeMillis() is not guaranteed to be monotonic. System.nanoTime() (1.5 only!) tries to be (I think), but it's still not guaranteed. Without a monotonic clock, if the clock shifts forward then it could timeout in-flight queries (way) too early. But: what happens when TimerThread overflows the int (a 2*1024*1024*1024)? Is it the caller's job to deal with the wraparound?
          Hide
          doronc Doron Cohen added a comment -

          Nice, I didn't think of this.

          So with this understanding the timer thread (with long vs int)
          makes sense while in Java 1.4, then in 1.5 System.nanoTime
          will do.

          The suggested patch relied on collect() being called, and
          so if a scorer takes long going over all the posting lists but
          fails to find a single match after the time passed, the search
          operation will not be stopped. I guess it is a fair assumption
          that this would be very rare...
          (so would be a system clock shift... : - ) )

          Also important to understand is what happens with IO
          resources once search is aborted with timeout exception.
          Current patch does not close the underlying streams (I
          mean IndexInput clones). I think this is ok, because
          once the search is aborted and there are no more references
          to the weights&scorers, the IndexInput clones would be
          eventually garbage collected. Others?

          Show
          doronc Doron Cohen added a comment - Nice, I didn't think of this. So with this understanding the timer thread (with long vs int) makes sense while in Java 1.4, then in 1.5 System.nanoTime will do. The suggested patch relied on collect() being called, and so if a scorer takes long going over all the posting lists but fails to find a single match after the time passed, the search operation will not be stopped. I guess it is a fair assumption that this would be very rare... (so would be a system clock shift... : - ) ) Also important to understand is what happens with IO resources once search is aborted with timeout exception. Current patch does not close the underlying streams (I mean IndexInput clones). I think this is ok, because once the search is aborted and there are no more references to the weights&scorers, the IndexInput clones would be eventually garbage collected. Others?
          Hide
          nyh Nadav Har'El added a comment -

          I'd like to add my 2 cents on this issue.

          The more I use Lucene in various ways, the more I'm getting convinced that the "Hits" API should be de-emphasized (if not outright depracated), and users should be encouraged to use the HitCollector API (search() taking a hitcollector, TopDocCollector, and so on) - especially for advanced usage.

          I believe that your TimeLimitedCollector is an interesting idea. However, there is simply no justification to change TopDocCollector, TopFieldDocCollector, Topdocs, Hits and Searcher. There's a MUCH simpler, and in my opinion cleaner, approach: Just make TimeLimitedCollector be a front-end for another collector (for example, TopDocCollector) which will be passed to TimeLimitedCollector's constructor. For every document that is presented to TimeLimitedCollector's collect(), it would call the inner collector's collect().

          This way, without any changes to Lucene's core APIs, and just the addition of a new class, you can add the new functionality that you wanted. This, I think, is the beauty of the HitCollector interface over the "monolithic" Hits approach, and I think we should encourage this way of thinking instead of adding more and more features to Hits.

          Show
          nyh Nadav Har'El added a comment - I'd like to add my 2 cents on this issue. The more I use Lucene in various ways, the more I'm getting convinced that the "Hits" API should be de-emphasized (if not outright depracated), and users should be encouraged to use the HitCollector API (search() taking a hitcollector, TopDocCollector, and so on) - especially for advanced usage. I believe that your TimeLimitedCollector is an interesting idea. However, there is simply no justification to change TopDocCollector, TopFieldDocCollector, Topdocs, Hits and Searcher. There's a MUCH simpler, and in my opinion cleaner, approach: Just make TimeLimitedCollector be a front-end for another collector (for example, TopDocCollector) which will be passed to TimeLimitedCollector's constructor. For every document that is presented to TimeLimitedCollector's collect(), it would call the inner collector's collect(). This way, without any changes to Lucene's core APIs, and just the addition of a new class, you can add the new functionality that you wanted. This, I think, is the beauty of the HitCollector interface over the "monolithic" Hits approach, and I think we should encourage this way of thinking instead of adding more and more features to Hits.
          Hide
          yseeley@gmail.com Yonik Seeley added a comment -

          make TimeLimitedCollector be a front-end for another collector (for example, TopDocCollector) which will be passed to TimeLimitedCollector's constructor.

          +1

          Show
          yseeley@gmail.com Yonik Seeley added a comment - make TimeLimitedCollector be a front-end for another collector (for example, TopDocCollector) which will be passed to TimeLimitedCollector's constructor. +1
          Hide
          doronc Doron Cohen added a comment -

          Sean, can you revise your patch to follow the suggestions above?

          That is, create a TimeLimitedCollector that takes and Collector parameter
          for its constructor. You should be able to hide all the TimerThread
          details (use long instead of int) within the implementation of this
          new collector class, and so when moving to Java5 the TimeThread
          can be replaced by nanoTime.

          Then we can add to either core search or under contrib.

          On a related point - I am not happy with programming by a
          runtimException - if others agree that this should become
          a standard functionality, how about modifying Collector.collect()
          to return a stop-or-continue indicator?

          Show
          doronc Doron Cohen added a comment - Sean, can you revise your patch to follow the suggestions above? That is, create a TimeLimitedCollector that takes and Collector parameter for its constructor. You should be able to hide all the TimerThread details (use long instead of int) within the implementation of this new collector class, and so when moving to Java5 the TimeThread can be replaced by nanoTime. Then we can add to either core search or under contrib. On a related point - I am not happy with programming by a runtimException - if others agree that this should become a standard functionality, how about modifying Collector.collect() to return a stop-or-continue indicator?
          Hide
          timmsc Sean Timm added a comment -

          Thanks for all of the feedback. I'll take another stab at this. I'm on vacation now and probably won't have time until I get back. It'll probably be early Jan. before I have a new patch ready.

          Show
          timmsc Sean Timm added a comment - Thanks for all of the feedback. I'll take another stab at this. I'm on vacation now and probably won't have time until I get back. It'll probably be early Jan. before I have a new patch ready.
          Hide
          tcn Timo Nentwig added a comment -

          IMHO it definitely should be part of the core. Being able to control the runtime of queries/your ressources is crucial for every live system and I really wonder this it has taken so long to address this.

          Otherwise I totally agree with Navdav: that Hits thingie is nice and fine for simple full-text queries but as soon as things become somewhat more complex you don't get around writing your own HitCollector (and do stuff like Facets).

          I also strongly agree that the timeout HC should be implemented as an decorator (what's been called "front-end" here), I just quickly wrote an example and attached it (no, I'm not happy throwing an runtime exception either):

          MyHitCollector hc = new MyHitCollector();
          s.search(q, null, HitCollectorTimeoutDecorator.decorate( hc, 10 ) );

          And finally, why return partial results? I don't think that this is reasonable.

          BTW I'm not sure whether volatile in the timer thread is really working reliably in 1.4...

          Show
          tcn Timo Nentwig added a comment - IMHO it definitely should be part of the core. Being able to control the runtime of queries/your ressources is crucial for every live system and I really wonder this it has taken so long to address this. Otherwise I totally agree with Navdav: that Hits thingie is nice and fine for simple full-text queries but as soon as things become somewhat more complex you don't get around writing your own HitCollector (and do stuff like Facets). I also strongly agree that the timeout HC should be implemented as an decorator (what's been called "front-end" here), I just quickly wrote an example and attached it (no, I'm not happy throwing an runtime exception either): MyHitCollector hc = new MyHitCollector(); s.search(q, null, HitCollectorTimeoutDecorator.decorate( hc, 10 ) ); And finally, why return partial results? I don't think that this is reasonable. BTW I'm not sure whether volatile in the timer thread is really working reliably in 1.4...
          Hide
          tcn Timo Nentwig added a comment -

          Example of the timeout HitCollector implemented as an decorator.

          Show
          tcn Timo Nentwig added a comment - Example of the timeout HitCollector implemented as an decorator.
          Hide
          tcn Timo Nentwig added a comment -

          Example HitCollector to be decorated by HitCollectorTimeoutDecorator.

          Show
          tcn Timo Nentwig added a comment - Example HitCollector to be decorated by HitCollectorTimeoutDecorator.
          Hide
          paul.elschot@xs4all.nl Paul Elschot added a comment -

          By using only a HitCollector it cannot be guaranteed that searches will not take too long. The reason is that there are searches that take a long time but do not collect any docs.

          For example consider the case where every even doc has term A and every odd doc has term B, and the query requires both A and B. This is going to take an amount of time that is linear in the size of the index but no document will be collected.

          This means that every conjunction (boolean queries with more than one required clause, phrase queries, span near queries) will need to check for timeout. Also a HitCollector with a timeout facility will need a way
          to be informed of a timeout when no document is collected.

          Show
          paul.elschot@xs4all.nl Paul Elschot added a comment - By using only a HitCollector it cannot be guaranteed that searches will not take too long. The reason is that there are searches that take a long time but do not collect any docs. For example consider the case where every even doc has term A and every odd doc has term B, and the query requires both A and B. This is going to take an amount of time that is linear in the size of the index but no document will be collected. This means that every conjunction (boolean queries with more than one required clause, phrase queries, span near queries) will need to check for timeout. Also a HitCollector with a timeout facility will need a way to be informed of a timeout when no document is collected.
          Hide
          tcn Timo Nentwig added a comment -

          True, unfortunately, but still better than nothing (->current situation). This approach isn't very precise in matters of timing either. Also, throwing a RuntimeException feels more like a hack than well thought out code...

          I don't know Lucene's code good enough in order to estimate whether it's possible to build a real timeout machanism at all/without changing the API/rewriting a lot of code but it's incredibly important to be able to cancel running queries. You don't want to servers under high load suffering from lucene queries running up to multiple minutes at the same time consuming quite a lot of memory. And it makes no sense either because nobody is waiting so long for results...

          Show
          tcn Timo Nentwig added a comment - True, unfortunately, but still better than nothing (->current situation). This approach isn't very precise in matters of timing either. Also, throwing a RuntimeException feels more like a hack than well thought out code... I don't know Lucene's code good enough in order to estimate whether it's possible to build a real timeout machanism at all/without changing the API/rewriting a lot of code but it's incredibly important to be able to cancel running queries. You don't want to servers under high load suffering from lucene queries running up to multiple minutes at the same time consuming quite a lot of memory. And it makes no sense either because nobody is waiting so long for results...
          Hide
          timmsc Sean Timm added a comment -

          Created a patch using Timo's HitCollectorTimeoutDecorator. This patch just has a few mostly minor changes. The biggest changes are that it uses a long instead of an int for the counter in TimerThread and the TimerThread interval is fixed at 10 ms. It also throws a TimeLimitedCollector.TimeExceeded exception on a timeout.

          Show
          timmsc Sean Timm added a comment - Created a patch using Timo's HitCollectorTimeoutDecorator. This patch just has a few mostly minor changes. The biggest changes are that it uses a long instead of an int for the counter in TimerThread and the TimerThread interval is fixed at 10 ms. It also throws a TimeLimitedCollector.TimeExceeded exception on a timeout.
          Hide
          timmsc Sean Timm added a comment -

          Updated bad patch with good copy.

          Created a patch using Timo's HitCollectorTimeoutDecorator. This patch just has a few mostly minor changes. The biggest changes are that it uses a long instead of an int for the counter in TimerThread and the TimerThread interval is fixed at 10 ms. It also throws a TimeLimitedCollector.TimeExceeded exception on a timeout.

          Show
          timmsc Sean Timm added a comment - Updated bad patch with good copy. Created a patch using Timo's HitCollectorTimeoutDecorator. This patch just has a few mostly minor changes. The biggest changes are that it uses a long instead of an int for the counter in TimerThread and the TimerThread interval is fixed at 10 ms. It also throws a TimeLimitedCollector.TimeExceeded exception on a timeout.
          Hide
          timmsc Sean Timm added a comment -

          Updated to work with latest patch.

          Show
          timmsc Sean Timm added a comment - Updated to work with latest patch.
          Hide
          timmsc Sean Timm added a comment -

          The TimerThreadTest illustrates the accuracy of the TimerThread under load. On my 2GHz Xeon 4 CPU dual core RH AS 4 Linux box, it get a 20% difference between the TimerThread implementation and System.currentTimeMillis() and is independent of load.

          java TimerThreadTest 8
          [...]
          10010 12020 [...]

          With my single core single CPU Windows XP laptop I see a 20% difference at load, but when adding additional threads, I see an increasing difference.

          java TimerThreadTest 0
          [...]
          10000 11819 [...]

          java TimerThreadTest 2
          [...]
          10040 18890 [...]

          Show
          timmsc Sean Timm added a comment - The TimerThreadTest illustrates the accuracy of the TimerThread under load. On my 2GHz Xeon 4 CPU dual core RH AS 4 Linux box, it get a 20% difference between the TimerThread implementation and System.currentTimeMillis() and is independent of load. java TimerThreadTest 8 [...] 10010 12020 [...] With my single core single CPU Windows XP laptop I see a 20% difference at load, but when adding additional threads, I see an increasing difference. java TimerThreadTest 0 [...] 10000 11819 [...] java TimerThreadTest 2 [...] 10040 18890 [...]
          Hide
          ab Andrzej Bialecki added a comment -

          I believe this version of the patch won't work properly unless you add synchronization - writes to long values are non-atomic (Java Language Specification 17.7, as the comment says), that's why Nutch uses an int there. Perhaps using AtomicLong would be an answer, if you really need a long value.

          Show
          ab Andrzej Bialecki added a comment - I believe this version of the patch won't work properly unless you add synchronization - writes to long values are non-atomic (Java Language Specification 17.7, as the comment says), that's why Nutch uses an int there. Perhaps using AtomicLong would be an answer, if you really need a long value.
          Hide
          hibou Nicolas Lalevée added a comment -

          AtomicLong is a Java 1.5 feature, so it doesn't fit either.

          Show
          hibou Nicolas Lalevée added a comment - AtomicLong is a Java 1.5 feature, so it doesn't fit either.
          Hide
          paul.elschot@xs4all.nl Paul Elschot added a comment -

          After LUCENE-584 more work will be needed to get all conjunctions in the same place, but it is a starting point.

          Once all conjunctions are in the same place, it would make sense to add a timeout there.

          Show
          paul.elschot@xs4all.nl Paul Elschot added a comment - After LUCENE-584 more work will be needed to get all conjunctions in the same place, but it is a starting point. Once all conjunctions are in the same place, it would make sense to add a timeout there.
          Hide
          timmsc Sean Timm added a comment -

          Andrzej--

          JLS 17.7 Non-atomic Treatment of double and long

          "Writes and reads of volatile long and double values are always atomic."

          Show
          timmsc Sean Timm added a comment - Andrzej-- JLS 17.7 Non-atomic Treatment of double and long "Writes and reads of volatile long and double values are always atomic."
          Hide
          ab Andrzej Bialecki added a comment -

          Indeed, thanks for the correction - I forgot about the special treatment of volatile values.

          Show
          ab Andrzej Bialecki added a comment - Indeed, thanks for the correction - I forgot about the special treatment of volatile values.
          Hide
          timmsc Sean Timm added a comment -

          You are right that I forgot to change the comment when I changed it from an int to a long though. "* updates to 32-bit-sized variables are atomic" is a pretty pointless comment now.

          Show
          timmsc Sean Timm added a comment - You are right that I forgot to change the comment when I changed it from an int to a long though. "* updates to 32-bit-sized variables are atomic" is a pretty pointless comment now.
          Hide
          tcn Timo Nentwig added a comment -

          @Sean: "The biggest changes are that it uses a long instead of an int for the counter in TimerThread" - didn't I use a volatile long already?

          I hope this will become part of the next release. IMHO this is Priority Major or above...

          Show
          tcn Timo Nentwig added a comment - @Sean: "The biggest changes are that it uses a long instead of an int for the counter in TimerThread" - didn't I use a volatile long already? I hope this will become part of the next release. IMHO this is Priority Major or above...
          Hide
          paul.elschot@xs4all.nl Paul Elschot added a comment -

          In the timeout.patch, instead of this:

          time += resolution;
          

          I'd rather have this:

          time = System.currentTimeMillis();
          

          because all of the wait() methods can become unreliable, especially at high loads.

          With (or without) this change, 100 msecs or even 200 msecs could be used as the
          update frequency instead of the current 10 msecs.

          By computing the time out moment up front, one subtraction can be saved at each document collection. Then only TIMER_THREAD.getMilliseconds() method is needed at document collection time, and the getElapsedMilliseconds() method is superfluous.

          Show
          paul.elschot@xs4all.nl Paul Elschot added a comment - In the timeout.patch, instead of this: time += resolution; I'd rather have this: time = System .currentTimeMillis(); because all of the wait() methods can become unreliable, especially at high loads. With (or without) this change, 100 msecs or even 200 msecs could be used as the update frequency instead of the current 10 msecs. By computing the time out moment up front, one subtraction can be saved at each document collection. Then only TIMER_THREAD.getMilliseconds() method is needed at document collection time, and the getElapsedMilliseconds() method is superfluous.
          Hide
          timmsc Sean Timm added a comment -

          @Timo: "didn't I use a volatile long already?" Indeed. I guess that wasn't a very big change then.

          Show
          timmsc Sean Timm added a comment - @Timo: "didn't I use a volatile long already?" Indeed. I guess that wasn't a very big change then.
          Hide
          timmsc Sean Timm added a comment -

          This is a minor update to timeout.patch which fixes the comment about updates to 32-bit-sized variables being atomic and instead talks about volatile longs, as pointed out by Andrzej. It also computes the time out moment up front to save a subtraction on each document collection as suggested by Paul.

          Show
          timmsc Sean Timm added a comment - This is a minor update to timeout.patch which fixes the comment about updates to 32-bit-sized variables being atomic and instead talks about volatile longs, as pointed out by Andrzej. It also computes the time out moment up front to save a subtraction on each document collection as suggested by Paul.
          Hide
          timmsc Sean Timm added a comment -

          I could go either way on the System.currentTimeMillis() versus a TimerThread issue. I agree nanoTime is the correct implementation when using 1.5.

          It doesn't seem like on Linux running ntp it matters much either way. NTP tries to perform smoothing and makes clock changes slowly over a longer period of time when it can rather than have an abrupt change, but YMMV if your system is having clock issues. On a really overloaded Windows box, the TimerThread implementation will not behave well as demonstrated above. I can't speak to any other platforms.

          Show
          timmsc Sean Timm added a comment - I could go either way on the System.currentTimeMillis() versus a TimerThread issue. I agree nanoTime is the correct implementation when using 1.5. It doesn't seem like on Linux running ntp it matters much either way. NTP tries to perform smoothing and makes clock changes slowly over a longer period of time when it can rather than have an abrupt change, but YMMV if your system is having clock issues. On a really overloaded Windows box, the TimerThread implementation will not behave well as demonstrated above. I can't speak to any other platforms.
          Hide
          paul.elschot@xs4all.nl Paul Elschot added a comment -

          The idea of System.currentTimeMillis() is to guard against misbehaviour of the java wait() method and against unexpected delays because of thread scheduling during the jump back for the loop around the wait() call.
          One way to reduce such misbehaviour under heavy load is by increasing the scheduling priority of the timing thread, but I don't think that is necessary.

          Also System.currentTimeMillis() is obviously correct, whereas timeout += resolution is never more than an assumption about correct wait() behaviour.

          Clock changes by NTP are normally so slow that they don't really matter for query time outs.

          Show
          paul.elschot@xs4all.nl Paul Elschot added a comment - The idea of System.currentTimeMillis() is to guard against misbehaviour of the java wait() method and against unexpected delays because of thread scheduling during the jump back for the loop around the wait() call. One way to reduce such misbehaviour under heavy load is by increasing the scheduling priority of the timing thread, but I don't think that is necessary. Also System.currentTimeMillis() is obviously correct, whereas timeout += resolution is never more than an assumption about correct wait() behaviour. Clock changes by NTP are normally so slow that they don't really matter for query time outs.
          Hide
          doronc Doron Cohen added a comment -

          Sean, can you add a Junit test to timeout.patch?

          I think such test should check (1) search correctness (regardless of timeout), (2) expected timeout behavior, and (3) some sanity test with multiple searching threads. This can also serve as an example for using this new collector.

          Cheers.
          Doron

          Show
          doronc Doron Cohen added a comment - Sean, can you add a Junit test to timeout.patch? I think such test should check (1) search correctness (regardless of timeout), (2) expected timeout behavior, and (3) some sanity test with multiple searching threads. This can also serve as an example for using this new collector. Cheers. Doron
          Hide
          tcn Timo Nentwig added a comment -

          IIRC I did time=System.currentTimeMillis() first but was surprised how slow this method actally is.

          Show
          tcn Timo Nentwig added a comment - IIRC I did time=System.currentTimeMillis() first but was surprised how slow this method actally is.
          Hide
          paul.elschot@xs4all.nl Paul Elschot added a comment -

          Would that still be a problem with a 200ms resolution?

          Show
          paul.elschot@xs4all.nl Paul Elschot added a comment - Would that still be a problem with a 200ms resolution?
          Hide
          timmsc Sean Timm added a comment - - edited

          Patch adds JUnit test cases as suggested by Doron.

          Show
          timmsc Sean Timm added a comment - - edited Patch adds JUnit test cases as suggested by Doron.
          Hide
          tcn Timo Nentwig added a comment -

          200ms? No, probably not. I don't recall what resolution I used in my test but actually the timeout check took more time than the Lucene query...

          Show
          tcn Timo Nentwig added a comment - 200ms? No, probably not. I don't recall what resolution I used in my test but actually the timeout check took more time than the Lucene query...
          Hide
          timmsc Sean Timm added a comment -

          Paul,
          I think that if we were to use System.currentTimeMillis(), we would eschew the TimerThread as Doron suggests in his Dec. 15 comment. I haven't seen any performance issues with System.currentTimeMillis().

          As far as 200ms, I think that is too large of a default resolution (and with the current implementation it is not configurable). With a 200 ms resolution, a query with a 1 second time allowed could timeout in 800 ms, and one with a time allowed of 500 ms could timeout in 300 ms. I think it is much worse to timeout a query early than to timeout late.

          Show
          timmsc Sean Timm added a comment - Paul, I think that if we were to use System.currentTimeMillis(), we would eschew the TimerThread as Doron suggests in his Dec. 15 comment. I haven't seen any performance issues with System.currentTimeMillis(). As far as 200ms, I think that is too large of a default resolution (and with the current implementation it is not configurable). With a 200 ms resolution, a query with a 1 second time allowed could timeout in 800 ms, and one with a time allowed of 500 ms could timeout in 300 ms. I think it is much worse to timeout a query early than to timeout late.
          Hide
          doronc Doron Cohen added a comment -

          Sean thanks for adding the test.

          In the attached I tightened the check of allowed elapsed time until timeout.
          Also added info in the exception, and added ability to modify the resolution - default is 20ms (was 5ms).
          Please let me know what you think.

          As for System.currentTimeMillis() vs. Timer thread - IMHO Mike's comment on 'system clock changes' makes the timer thread favorable.

          I checked this with up to 10,000 threads and with that number the test sometimes fails because it is quite tight on the max elapsed time required comparing to the timeout, so I don't see this is a problem. In the attached N_THREADS = 50 and this number of threads always passes for me.

          If there are no more major concerns I think this is now ready to go in, question is where to - under core o.a.l.search or under contrib (query or misc).
          Others?

          Show
          doronc Doron Cohen added a comment - Sean thanks for adding the test. In the attached I tightened the check of allowed elapsed time until timeout. Also added info in the exception, and added ability to modify the resolution - default is 20ms (was 5ms). Please let me know what you think. As for System.currentTimeMillis() vs. Timer thread - IMHO Mike's comment on 'system clock changes' makes the timer thread favorable. I checked this with up to 10,000 threads and with that number the test sometimes fails because it is quite tight on the max elapsed time required comparing to the timeout, so I don't see this is a problem. In the attached N_THREADS = 50 and this number of threads always passes for me. If there are no more major concerns I think this is now ready to go in, question is where to - under core o.a.l.search or under contrib (query or misc). Others?
          Hide
          timmsc Sean Timm added a comment -

          Doron, your comment for setResolution(long) says "The default timer resolution is 50 milliseconds", however, the default is actually 20 ms (public static final int DEFAULT_RESOLUTION = 20. Other than that, everything looks great.

          Show
          timmsc Sean Timm added a comment - Doron, your comment for setResolution(long) says "The default timer resolution is 50 milliseconds", however, the default is actually 20 ms (public static final int DEFAULT_RESOLUTION = 20 . Other than that, everything looks great.
          Hide
          doronc Doron Cohen added a comment -

          Oh wrote comment that was before I decided to change the default...
          Thanks for catching this.

          Show
          doronc Doron Cohen added a comment - Oh wrote comment that was before I decided to change the default... Thanks for catching this.
          Hide
          doronc Doron Cohen added a comment -

          Attached patch corrects default resolution comment.

          Show
          doronc Doron Cohen added a comment - Attached patch corrects default resolution comment.
          Hide
          timmsc Sean Timm added a comment -

          "If there are no more major concerns I think this is now ready to go in, question is where to - under core o.a.l.search or under contrib (query or misc)."

          My preference would be for core o.a.l.search.

          Show
          timmsc Sean Timm added a comment - "If there are no more major concerns I think this is now ready to go in, question is where to - under core o.a.l.search or under contrib (query or misc)." My preference would be for core o.a.l.search.
          Hide
          tcn Timo Nentwig added a comment -

          I agree, core.

          Show
          tcn Timo Nentwig added a comment - I agree, core.
          Hide
          yseeley@gmail.com Yonik Seeley added a comment -

          > My preference would be for core o.a.l.search.
          +1

          Show
          yseeley@gmail.com Yonik Seeley added a comment - > My preference would be for core o.a.l.search. +1
          Hide
          doronc Doron Cohen added a comment -

          Committed (under core o.a.l.search).
          Thanks Sean!

          Show
          doronc Doron Cohen added a comment - Committed (under core o.a.l.search). Thanks Sean!

            People

            • Assignee:
              doronc Doron Cohen
              Reporter:
              timmsc Sean Timm
            • Votes:
              3 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development