Details
Improvement
Status: Open
Minor
Resolution: Unresolved
3.0.1
None
Irrelevant
New, Patch Available
Description
Here's the situation: we have a site with a fair number of indexes that we're using MultiSearcher/ParallelMultiSearcher for, but the users can select an arbitrary combination of indexes to search. For example (contrived, but illustrative): the site has indexes numbered 1 - 10; user A wants to search all 10; user B wants to search indexes 1, 2 and 3; user C wants to search the even-numbered indexes. As of Lucene 3.0.1, the only way to do this is to instantiate a new MultiSearcher for every combination of indexes that a user wants, which is not ideal at all.
What I've done is add a new parameter, a Set<Searchable>, to all methods in MultiSearcher that use the searchables array (docFreq, search, rewrite and createDocFrequencyMap). The set is checked with isEmpty() and contains() on every iteration over searchables[]. The actual logic has been moved into these new methods, and the old methods have become overloads that pass Collections.emptySet() into them, so I do not expect a noticeable performance impact from this modification, if it's measurable at all.
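To make the idea concrete, here is a minimal, self-contained sketch of the pattern. It is not the attached patch and it does not use Lucene's classes: SimpleSearchable and SubsetMultiSearcherSketch are hypothetical stand-ins, only docFreq is modeled, and the sketch assumes that a non-empty set names the searchables to include.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical stand-in for Lucene's Searchable; only docFreq is modeled.
interface SimpleSearchable {
    int docFreq(String term);
}

// Illustrative only: not Lucene's MultiSearcher and not the attached patch.
class SubsetMultiSearcherSketch {
    private final SimpleSearchable[] searchables;

    SubsetMultiSearcherSketch(SimpleSearchable... searchables) {
        this.searchables = searchables;
    }

    // The old signature becomes an overload that passes an empty set,
    // which is treated as "use every searchable".
    int docFreq(String term) {
        return docFreq(term, Collections.<SimpleSearchable>emptySet());
    }

    // The logic lives here. Assumption: a non-empty set lists the
    // searchables to include; everything else is skipped.
    int docFreq(String term, Set<SimpleSearchable> subset) {
        int total = 0;
        for (SimpleSearchable s : searchables) {
            if (!subset.isEmpty() && !subset.contains(s)) {
                continue; // not selected for this search
            }
            total += s.docFreq(term);
        }
        return total;
    }
}
```

With this shape, one long-lived searcher can serve every user: each request passes its own set, and existing callers of the one-argument method see no change in behaviour.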
I didn't modify the test for MultiSearcher very much, just enough to illustrate that subsetting of the search results works, since no other logic has changed. If I need to do more for the testing, let me know and I'll do it.
I've attached the patches for MultiSearcher.java, ParallelMultiSearcher.java and TestMultiSearcher.java.