The current implementation of ParallelCompositeReader reassembles the whole subreader structure of the wrapped reader with ParallelLeafReader and ParallelCompositeReader.
This leads to bugs like described in
LUCENE-6500. This reaches back to the time when this reader was reimplemented for the first time shortly before release of 4.0. Shortly afterwards, we completely changed our search infrastructure to just call leaves() and working with them. The method getSequentialSubReaders was made protected, just to be implemented by subclasses (like this one). But no external code can ever call it. Also the search API just rely on the baseId in relation to the top-level reader (to correctly present document ids). The structure is completely unimportant.
This issue will therefore simplify ParallelCompositeReader to just fetch all LeafReaders and build a flat structure of ParallelLeafReaders from it. This also has the nice side-effect, that only the parallel leaf readers must be equally sized, not their structure.
This issue will solve
LUCENE-6500 as a side effect. I just opened a new issue for discussion and to have this listed as "feature" and not bug.
In general, we could also hide the ParallelLeafReader class and make it an implementation detail. ParallelCompositeReader would be the only entry point -> because people could pass any IndexReader structure in, a single AtomicReader would just produce a CompositeReader with one leaf. We could then also rename it back to ParallelReader (like it was in pre Lucene4).