Details
- Type: Improvement
- Status: Closed
- Priority: Minor
- Resolution: Fixed
Description
FilterDirectoryReader extends BaseCompositeReader, which computes both maxDoc and numDocs eagerly in its constructor by summing up these values across all sub leaves.
This is problematic for readers that hide additional documents. Computing numDocs on such leaf readers usually requires iterating over all live documents to count them. As a result, creating a FilterDirectoryReader on top of them runs in linear time, which has caused us several performance bugs over time. This is especially frustrating given that numDocs is a rarely used index statistic.
I think computing numDocs lazily would be less surprising?
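To illustrate the idea, here is a minimal, self-contained sketch of a composite reader that still sums maxDoc eagerly but defers the numDocs sum until the first call. The types `SubReader`, `CountingSubReader` and `LazyCompositeReader` are hypothetical stand-ins invented for this example. They are not Lucene classes, and this is not necessarily how the fix was implemented.

```java
import java.util.List;

/** Hypothetical stand-in for a sub reader; not a Lucene type. */
interface SubReader {
    int maxDoc();
    int numDocs(); // may cost O(maxDoc) on readers that hide documents
}

/** Hypothetical sub reader that counts live docs by scanning a bitmap. */
final class CountingSubReader implements SubReader {
    private final boolean[] live;
    private int numDocsCalls = 0;

    CountingSubReader(boolean[] live) {
        this.live = live.clone();
    }

    @Override
    public int maxDoc() {
        return live.length; // cheap: no scan needed
    }

    @Override
    public int numDocs() {
        numDocsCalls++;
        int count = 0;
        for (boolean b : live) { // linear scan over all documents
            if (b) count++;
        }
        return count;
    }

    int numDocsCalls() {
        return numDocsCalls;
    }
}

/** Hypothetical composite: maxDoc is summed eagerly, numDocs lazily. */
final class LazyCompositeReader {
    private final List<SubReader> subReaders;
    private final int maxDoc;
    private volatile int numDocs = -1; // -1 means "not computed yet"

    LazyCompositeReader(List<SubReader> subReaders) {
        this.subReaders = List.copyOf(subReaders);
        int sum = 0;
        for (SubReader r : this.subReaders) {
            sum += r.maxDoc();
        }
        this.maxDoc = sum; // the constructor never calls numDocs()
    }

    int maxDoc() {
        return maxDoc;
    }

    int numDocs() {
        int n = numDocs;
        if (n == -1) {
            n = 0;
            for (SubReader r : subReaders) {
                n += r.numDocs();
            }
            // Racing threads may both compute the sum, but they store the same value.
            numDocs = n;
        }
        return n;
    }
}
```

With this shape, constructing the composite only touches maxDoc on each sub reader, and the potentially linear numDocs scan is paid at most once per sub reader in the single-threaded case, and only if a caller actually asks for it.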