Details
-
Improvement
-
Status: Open
-
Minor
-
Resolution: Unresolved
-
None
-
None
-
None
-
New
Description
Currently Lucene has TimeLimitingCollector to support time-bound collection and it will throw
TimeExceededException if timeout happens. This only works nicely with the single-thread low-level API from the IndexSearcher. The method signature is –
void search(List<LeafReaderContext> leaves, Weight weight, Collector collector)
The intended use is to always enclose the searcher.search(query, collector) call with a try ... catch and handle the timeout exception. Unfortunately when working with a CollectorManager in the multi-thread search context, the TimeExceededException thrown during collecting one leaf slice will be re-thrown by IndexSearcher without calling CollectorManager's reduce(), even if other slices are successfully collected. The signature
of the search api with CollectorManager is –
<C extends Collector, T> T search(Query query, CollectorManager<C, T> collectorManager)
The good news is that IndexSearcher handles CollectionTerminatedException gracefully by ignoring it. We can either wrap TimeLimitingCollector and throw CollectionTerminatedException when timeout happens or simply replace TimeExceededException with CollectionTerminatedException. In either way, we also need to maintain a flag that indicates if timeout occurred so that the user know it's a partial collection.