Thanks Emmanuel Keller: this is an impressive change!
Can you add minimal javadocs to ParallelDrillSideways, and include @lucene.experimental?
Can you fix the indent to 2 spaces, and change your IDE to not use
wildcard imports? (Most of the new classes seem to do so, but at
least one didn't). Or we can fix this up before pushing...
Should CallableCollector be renamed to CallableCollectorManager?
I assume you're using this for your QWAZR search server built on Lucene (https://github.com/qwazr/QWAZR)? Thank you for giving back!
There are quite a few new abstractions here, such as
MultiCollectorManager and FacetsCollectorManager; must they be
public? Can you explain what they do?
It seems like this change opens up concurrency in two ways. The first
is that it uses the IndexSearcher.search API that takes a
CollectorManager, so that if you created that IndexSearcher with an
executor, you get concurrency across the
segments in the index. In general I'm not a huge fan of this
concurrency since you are at the whim of how the segments are
structured, and, confusingly, running forceMerge(1) on your index
removes all concurrency. But it's better than nothing: progress not
perfection.
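
To make the forceMerge(1) point concrete, here is a rough sketch of the per-segment model. It is a toy, not Lucene code: SegmentConcurrencySketch and countEvens are hypothetical names, each int[] stands in for one segment, and I assume one task per segment for simplicity. It only shows that the number of parallel tasks tracks the number of segments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Toy model of per-segment search concurrency (hypothetical, not Lucene code). */
final class SegmentConcurrencySketch {

  /**
   * Counts even "docs" across all segments, submitting one task per segment.
   * Returns {total hits, number of tasks submitted}.
   */
  static int[] countEvens(List<int[]> segments, int threads) {
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    try {
      List<CompletableFuture<Integer>> partials = new ArrayList<>();
      for (int[] segment : segments) {
        // Each segment is collected independently, like one collector per slice.
        partials.add(CompletableFuture.supplyAsync(() -> {
          int hits = 0;
          for (int doc : segment) {
            if (doc % 2 == 0) hits++;
          }
          return hits;
        }, executor));
      }
      // Reduce step: merge the per-segment partial results.
      int total = 0;
      for (CompletableFuture<Integer> partial : partials) {
        total += partial.join();
      }
      return new int[] {total, partials.size()};
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) {
    List<int[]> multi = List.of(new int[] {1, 2, 3, 4}, new int[] {5, 6}, new int[] {7, 8, 9, 10});
    List<int[]> merged = new ArrayList<>();
    merged.add(new int[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}); // after a forceMerge(1)-like step
    System.out.println("tasks before merge: " + countEvens(multi, 4)[1]);
    System.out.println("tasks after merge:  " + countEvens(merged, 4)[1]);
  }
}
```

The hit count is the same either way; only the available parallelism changes, which is why the segment structure matters so much for this kind of concurrency.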
The second way is that the new ParallelDrillSideways takes its own
executor and then runs the N DrillDown queries concurrently (to
compute the sideways counts), which is very different from the current
doc-at-a-time computation. Have you compared the performance, using a
single thread? ... I'm curious how "doc at a time" vs "query at a
time" (which is also Solr's approach) compare. But, still, the fact
that this "query at a time" approach enables concurrency is a big win.
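
For anyone following along, here is a rough sketch of the two strategies as I understand them. It is a simplified toy, not the actual DrillSideways or ParallelDrillSideways code: DrillSidewaysSketch and its methods are hypothetical names, each doc is a String[] with one value per dimension, and the "queries" are plain filters. The sideways counts for a dimension are taken over docs that pass the drill-down filters of all the other dimensions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Toy model of drill-sideways counting (hypothetical, not Lucene code). */
final class DrillSidewaysSketch {

  /** True if the doc passes every drill-down filter except the one at index skip. */
  static boolean passesOthers(String[] doc, String[] filters, int skip) {
    for (int d = 0; d < filters.length; d++) {
      if (d != skip && !filters[d].equals(doc[d])) return false;
    }
    return true;
  }

  /** Doc at a time: a single pass over the docs updates the counts of every dimension. */
  static List<Map<String, Integer>> docAtATime(List<String[]> docs, String[] filters) {
    List<Map<String, Integer>> counts = new ArrayList<>();
    for (int d = 0; d < filters.length; d++) counts.add(new TreeMap<>());
    for (String[] doc : docs) {
      for (int d = 0; d < filters.length; d++) {
        if (passesOthers(doc, filters, d)) counts.get(d).merge(doc[d], 1, Integer::sum);
      }
    }
    return counts;
  }

  /** Query at a time: one independent pass per dimension, each submitted to an executor. */
  static List<Map<String, Integer>> queryAtATime(List<String[]> docs, String[] filters, int threads) {
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    try {
      List<CompletableFuture<Map<String, Integer>>> futures = new ArrayList<>();
      for (int d = 0; d < filters.length; d++) {
        final int dim = d;
        futures.add(CompletableFuture.supplyAsync(() -> {
          Map<String, Integer> dimCounts = new TreeMap<>();
          for (String[] doc : docs) {
            if (passesOthers(doc, filters, dim)) dimCounts.merge(doc[dim], 1, Integer::sum);
          }
          return dimCounts;
        }, executor));
      }
      List<Map<String, Integer>> counts = new ArrayList<>();
      for (CompletableFuture<Map<String, Integer>> future : futures) counts.add(future.join());
      return counts;
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String[]> docs = List.of(
        new String[] {"red", "L"}, new String[] {"red", "M"}, new String[] {"blue", "L"},
        new String[] {"blue", "M"}, new String[] {"red", "L"});
    String[] filters = {"red", "L"}; // drill down on color=red and size=L
    System.out.println(docAtATime(docs, filters));
    System.out.println(queryAtATime(docs, filters, 2));
  }
}
```

Both methods should return the same counts. The difference is that the per-dimension passes in queryAtATime are independent, so they parallelize trivially, at the cost of visiting the docs once per dimension instead of once overall.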
I wonder if we could absorb ParallelDrillSideways under
DrillSideways such that if you pass an executor it uses the
concurrent implementation? It's really an implementation/execution
detail I think? Similar to how IndexSearcher takes an optional