Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
New, Patch Available
Description
The ToParentBlockJoinCollector#getTopGroups method takes several arguments:
public TopGroups<Integer> getTopGroups(ToParentBlockJoinQuery query, Sort withinGroupSort, int offset, int maxDocsPerGroup, int withinGroupOffset, boolean fillSortFields)
One of them, maxDocsPerGroup, sets the upper bound on the number of child documents returned within each group.
During the search, ToParentBlockJoinCollector collects and caches every child document matched by the given ToParentBlockJoinQuery in OneGroup objects, so it is possible to create GroupDocs containing all matched child documents rather than only the subset bounded by maxDocsPerGroup.
When you specify maxDocsPerGroup, a new queue (I mean a TopScoreDocCollector or TopFieldCollector) is created for each group, and maxDocsPerGroup objects are created within each queue. This leads to redundant memory allocation when the number of child documents in a group is less than maxDocsPerGroup.
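To make the cost concrete, here is a rough sketch of the arithmetic. It does not use Lucene classes: slotsAllocated and slotsNeeded are hypothetical helpers, and the model simply assumes each group's queue reserves exactly maxDocsPerGroup entries up front, as described above.

```java
public class QueueAllocationSketch {
    // Hypothetical model: every group gets a queue with maxDocsPerGroup
    // pre-created entries, regardless of how many children it matched.
    static long slotsAllocated(int[] matchedChildrenPerGroup, int maxDocsPerGroup) {
        return (long) matchedChildrenPerGroup.length * maxDocsPerGroup;
    }

    // Entries that can actually be filled: at most the number of matched
    // children in each group, capped by maxDocsPerGroup.
    static long slotsNeeded(int[] matchedChildrenPerGroup, int maxDocsPerGroup) {
        long total = 0;
        for (int matched : matchedChildrenPerGroup) {
            total += Math.min(matched, maxDocsPerGroup);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] matchedChildrenPerGroup = {2, 3, 1}; // three parent groups
        int maxDocsPerGroup = 1000;                // "large enough to get everything"
        System.out.println("allocated: " + slotsAllocated(matchedChildrenPerGroup, maxDocsPerGroup));
        System.out.println("needed:    " + slotsNeeded(matchedChildrenPerGroup, maxDocsPerGroup));
    }
}
```

Under this model, a caller who passes a large maxDocsPerGroup just to be sure of getting every child pays for entries that are never used.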
I suppose there are many cases where you need all child documents matched by the query, so it would be nice to be able to get the top groups with all matched child documents without unnecessary memory allocation.
A possible solution is to pass a negative maxDocsPerGroup when you need all matched child documents within each group, and to check the maxDocsPerGroup value: if it is negative, create a queue sized to the number of matched child documents in the group; otherwise, create a queue of size maxDocsPerGroup.
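The check could look roughly like the sketch below. It is only an illustration of the rule, not the attached patch: queueSize and matchedChildCount are hypothetical names, and the code assumes the collector already knows how many child documents it cached for the group.

```java
public class GroupQueueSizing {
    // Hypothetical helper: choose the per-group queue size.
    // A negative maxDocsPerGroup means "return all matched children",
    // so the queue is sized to the number of children cached for the group.
    static int queueSize(int maxDocsPerGroup, int matchedChildCount) {
        if (maxDocsPerGroup < 0) {
            return matchedChildCount;
        }
        return maxDocsPerGroup;
    }

    public static void main(String[] args) {
        // Negative bound: the queue holds exactly the matched children.
        System.out.println(queueSize(-1, 7));
        // Non-negative bound: behaviour is unchanged from today.
        System.out.println(queueSize(10, 3));
    }
}
```

Existing callers that pass a non-negative maxDocsPerGroup would see no change; only the negative case takes the new path.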