I was recently asked about this issue, and when I started digging into it I got more and more confused.
It seems that fundamentally, what happened here is that Umesh initially filed a bug regarding the way the collapse QParser selects the "group head" – but this bug report was based on a misunderstanding about what the default behavior of CollapseQParser is when dealing with a sort param (as compared to the older GroupingComponent).
There was some key discussion about this issue on the solr-user mailing list, which did not result in updating the summary/description of this issue, followed by Umesh attaching a patch attempting to implement some changes in behavior.
I have some thoughts on Umesh's approach, and my own suggestions, but before I get into that I want to make sure the situation is accurately represented in this Jira.
First off, some key discussion from the solr-user mailing list circa June 2014 that should really be captured directly in this issue.
In particular these comments from Joel...
So, the question is what is the cost (performance and memory) of having the
CollapsingQParserPlugin choose the group head by using the Solr sort
Keep in mind that the CollapsingQParserPlugin's main design goal is to
provide fast performance when collapsing on a high cardinality field. How
you choose the group head can have a big impact here, both on memory
The function query collapse criteria was added to allow you to come up with
custom formulas for selecting the group head, with little or no impact on
performance and memory. Using Solr's recip() function query it seems like
you could come up with some nice scenarios where two variables could be
used to select the group head. For example:
And this response from Umesh...
I agree 200 MB per request just for collapsing the search results is huge
but at least it increases linearly with number of sort fields.. For my use
case, I am willing to pay the linear cost specially when I can't combine
the sort fields intelligently into a sort function. Plus it allows me to
sort by String/Text fields also which is a big win.
Based on all of the comments regarding this issue, including the email discussion, I've revised the summary & description to make it clear:
- this is a feature request
- that the goal is to expand the options available to users of the collapse QParser by allowing "group head" documents to be selected by more complex sort options
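
To make "group head selected by more complex sort options" concrete, here is a rough sketch in plain Python. This is not Solr code and not how CollapsingQParserPlugin is implemented; the `collapse` function, the `(field, direction)` sort clauses, and the sample documents are all hypothetical, made up purely for illustration. It keeps, for each distinct value of the collapse field, the document that would sort first under a multi-clause sort, including a clause on a string field.

```python
def collapse(docs, field, sort):
    """Keep one 'group head' per distinct value of `field`.

    `sort` is a list of (field_name, direction) clauses, where direction
    is "asc" or "desc". Within each group, the doc that would come first
    under that sort is kept as the group head.
    (Hypothetical helper for illustration only; not a Solr API.)
    """
    def compare(a, b):
        # Walk the sort clauses in order; the first clause on which the
        # two docs differ decides the comparison.
        for name, direction in sort:
            if a[name] != b[name]:
                result = -1 if a[name] < b[name] else 1
                return result if direction == "asc" else -result
        return 0

    heads = {}
    for doc in docs:
        key = doc[field]
        # Replace the current head only if this doc sorts strictly before it.
        if key not in heads or compare(doc, heads[key]) < 0:
            heads[key] = doc
    return list(heads.values())


# Made-up sample data: three docs in group "a", one in group "b".
docs = [
    {"id": "1", "group": "a", "title": "zebra", "price": 10},
    {"id": "2", "group": "a", "title": "apple", "price": 10},
    {"id": "3", "group": "b", "title": "mango", "price": 5},
    {"id": "4", "group": "a", "title": "kiwi", "price": 20},
]

# Lowest price wins; ties are broken by title, a string field.
heads = collapse(docs, "group", [("price", "asc"), ("title", "asc")])
```

Note that this sketch has to hold one full candidate document per group while it scans, which is a toy version of the memory trade-off raised in the email discussion above: the more sort clauses involved, the more per-group state has to be tracked.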