Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 5.0
    • Fix Version/s: 4.6, 5.0
    • Component/s: search
    • Labels:
      None

      Description

      This ticket introduces the CollapsingQParserPlugin.

      The CollapsingQParserPlugin is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with ngroups) when the number of distinct groups in the result set is high.

      For example, in one performance test of a search with 10 million full results and 1 million collapsed groups:
      Standard grouping with ngroups: 17 seconds.
      CollapsingQParserPlugin: 300 milliseconds.

      Sample syntax:

      Collapse based on the highest scoring document:

      fq={!collapse field=<field_name>}
      

      Collapse based on the min value of a numeric field:

      fq={!collapse field=<field_name> min=<field_name>}
      

      Collapse based on the max value of a numeric field:

      fq={!collapse field=<field_name> max=<field_name>}
      

      Collapse with a null policy:

      fq={!collapse field=<field_name> nullPolicy=<null_policy>}
      

      There are three null policies:
      ignore : removes docs with a null value in the collapse field (default).
      expand : treats each doc with a null value in the collapse field as a separate group.
      collapse : collapses all docs with a null value into a single group using either highest score, or min/max.
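The three policies can be illustrated with a small in-memory sketch in Python. It is not the plugin's implementation; the function name `collapse`, the `score` key, and the sample documents are assumptions, and only the highest-score selector is modeled.

```python
from typing import Any, Dict, List

Doc = Dict[str, Any]


def collapse(docs: List[Doc], field: str, null_policy: str = "ignore") -> List[Doc]:
    """Keep the highest-scoring doc per group, handling nulls per policy."""
    if null_policy not in ("ignore", "expand", "collapse"):
        raise ValueError(f"unknown null policy: {null_policy}")
    best: Dict[Any, Doc] = {}   # collapse value -> best doc seen so far
    kept_nulls: List[Doc] = []  # null-valued docs that survive
    for doc in docs:
        value = doc.get(field)
        if value is None:
            if null_policy == "expand":
                # every null doc is treated as its own group
                kept_nulls.append(doc)
            elif null_policy == "collapse":
                # all null docs compete for a single slot
                if not kept_nulls or doc["score"] > kept_nulls[0]["score"]:
                    kept_nulls = [doc]
            # "ignore": the doc is dropped
            continue
        if value not in best or doc["score"] > best[value]["score"]:
            best[value] = doc
    survivors = list(best.values()) + kept_nulls
    return sorted(survivors, key=lambda d: d["score"], reverse=True)


docs = [
    {"id": 1, "group": "a", "score": 1.0},
    {"id": 2, "group": "a", "score": 3.0},
    {"id": 3, "group": "b", "score": 2.0},
    {"id": 4, "group": None, "score": 5.0},
    {"id": 5, "score": 4.0},  # field missing entirely, treated as null here
]
```

With these sample documents, `ignore` keeps ids 2 and 3, `expand` keeps ids 4, 5, 2 and 3, and `collapse` keeps ids 4, 2 and 3, each listed in score order.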

      The CollapsingQParserPlugin also fully supports the QueryElevationComponent.

      Note: The July 16 patch also includes an ExpandComponent that expands the collapsed groups for the current search result page. This functionality will be moved to its own ticket.
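The sample filters in this description can also be assembled programmatically. The following is a minimal Python sketch, not part of the patch; the helper names `collapse_fq` and `solr_params`, and the rule that `min` and `max` are not combined, are assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import urlencode

NULL_POLICIES = ("ignore", "expand", "collapse")


def collapse_fq(field: str,
                min_field: Optional[str] = None,
                max_field: Optional[str] = None,
                null_policy: Optional[str] = None) -> str:
    """Build a {!collapse ...} filter string (hypothetical helper)."""
    # Assumption of this sketch: min and max are alternatives, not combined.
    if min_field and max_field:
        raise ValueError("specify at most one of min_field and max_field")
    if null_policy is not None and null_policy not in NULL_POLICIES:
        raise ValueError(f"unknown null policy: {null_policy}")
    parts = ["!collapse", f"field={field}"]
    if min_field:
        parts.append(f"min={min_field}")
    if max_field:
        parts.append(f"max={max_field}")
    if null_policy:
        parts.append(f"nullPolicy={null_policy}")
    return "{" + " ".join(parts) + "}"


def solr_params(q: str, fq: str) -> str:
    """URL-encode q and fq so braces and spaces survive an HTTP request."""
    return urlencode({"q": q, "fq": fq})
```

For instance, `collapse_fq("group_id", max_field="price", null_policy="expand")` produces `{!collapse field=group_id max=price nullPolicy=expand}`, where `group_id` and `price` are made-up field names.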

      1. SOLR-5027.patch
        10 kB
        Joel Bernstein
      2. SOLR-5027.patch
        13 kB
        Joel Bernstein
      3. SOLR-5027.patch
        26 kB
        Joel Bernstein
      4. SOLR-5027.patch
        27 kB
        Joel Bernstein
      5. SOLR-5027.patch
        31 kB
        Joel Bernstein
      6. SOLR-5027.patch
        37 kB
        Joel Bernstein
      7. SOLR-5027.patch
        60 kB
        Joel Bernstein
      8. SOLR-5027.patch
        60 kB
        Joel Bernstein
      9. SOLR-5027.patch
        61 kB
        Joel Bernstein

        Issue Links

        There are no Sub-Tasks for this issue.

          Activity

          Joel Bernstein created issue -
          Joel Bernstein added a comment -

          Initial patch for review.

          Joel Bernstein made changes -
          Field Original Value New Value
          Attachment SOLR-5027.patch [ 12591677 ]
          Joel Bernstein made changes -
          Link This issue relates to SOLR-4465 [ SOLR-4465 ]
          Joel Bernstein made changes -
          Link This issue relates to SOLR-5020 [ SOLR-5020 ]
          Otis Gospodnetic added a comment -

          Q: when you refer to collapsing and grouping, are you saying this is an alternative implementation to the current impl for field collapsing/grouping?

          Is the query result cache commented out in the patch on purpose?

          Joel Bernstein added a comment -

          This is an alternative for field collapsing. It only collapses the groups. This is the first step of moving grouping into the main search flow. The second step is to create a separate search component that works with the collapsed doclist and expands the groups for a single page. The two combined would be a replacement for the current grouping functionality.

          With this approach to field collapsing the main doclist/docset are collapsed. So there is no concept of ngroups or group facets. The result count and facet counts automatically line up with the collapsed doclist/docset.

          The query result cache was commented out for performance testing only. In later patches I'll leave this out.

          Otis Gospodnetic added a comment -

          Thanks for the clarification. Will this perform better than current grouping?

          Joel Bernstein added a comment -

          It should perform much better than the combination of ngroups and group facets, which carry a steep performance penalty as ngroups rises. More testing is needed, though, to see how this plays out.

          Joel Bernstein added a comment -

          Added small initial test case.

          Joel Bernstein made changes -
          Attachment SOLR-5027.patch [ 12592036 ]
          Joel Bernstein made changes -
          Attachment SOLR-5027.patch [ 12592554 ]
          Joel Bernstein made changes -
          Summary changed from "CollapsingQParserPlugin" to "Result Set Collapse and Expand Plugins"
          Joel Bernstein made changes -
          Description The CollapsingQParserPlugin is a PostFilter that performs field collapsing.

          This allows field collapsing to be done within the normal search flow.

          Initial syntax:

          fq={!collapse field=<field_name>}

          All documents in a group will be collapsed to the highest ranking document in the group.



          This ticket introduces two new Solr plugins, the CollapsingQParserPlugin and the ExpandComponent.


          The CollapsingQParserPlugin is a PostFilter that performs field collapsing.

          This allows field collapsing to be done within the normal search flow.

          Initial syntax:

          fq={!collapse field=<field_name>}

          All documents in a group will be collapsed to the highest ranking document in the group.

          The ExpandComponent is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided.

          Initial syntax:

          expand=true - Turns on the expand component.
          expand.field=<field> - Expands results for this field
          expand.limit=5 - Limits the documents for each expanded group.
          expand.sort=<sort spec> - The sort spec for the expanded documents. Default is score.
          expand.rows=500 - The max number of expanded results to bring back. Default is 500.
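As an illustration of how these parameters could interact, here is a small Python sketch. It is not the ExpandComponent's code; the function name `expand_page`, the dictionary-based documents, and the choice to leave the collapsed representative out of its own expansion are assumptions. Only score-descending order is modeled for expand.sort.

```python
from typing import Any, Dict, List

Doc = Dict[str, Any]


def expand_page(page: List[Doc], hits: List[Doc], field: str,
                limit: int = 5, rows: int = 500) -> Dict[Any, List[Doc]]:
    """Expand the groups of the collapsed docs shown on one page."""
    on_page = {doc["id"] for doc in page}
    wanted = {doc[field] for doc in page}
    members: Dict[Any, List[Doc]] = {}
    # Walk all hits in score order so each group's list is already sorted.
    for doc in sorted(hits, key=lambda d: d["score"], reverse=True):
        if doc[field] in wanted and doc["id"] not in on_page:
            members.setdefault(doc[field], []).append(doc)
    expanded: Dict[Any, List[Doc]] = {}
    remaining = rows  # overall cap, in the spirit of expand.rows
    for head in page:
        take = min(limit, remaining)  # per-group cap, in the spirit of expand.limit
        group = members.get(head[field], [])[:take]
        expanded[head[field]] = group
        remaining -= len(group)
    return expanded
```

Groups that do not appear on the current page are never expanded, which is what keeps the work proportional to the page size and not to the full result set.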




          Joel Bernstein made changes -
          Description
          This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*.


          The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.

          This allows field collapsing to be done within the normal search flow.

          Initial syntax:

          fq={!collapse field=<field_name>}

          All documents in a group will be collapsed to the highest ranking document in the group.

          The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided.

          Initial syntax:

          expand=true - Turns on the expand component.
          expand.field=<field> - Expands results for this field
          expand.limit=5 - Limits the documents for each expanded group.
          expand.sort=<sort spec> - The sort spec for the expanded documents. Default is score.
          expand.rows=500 - The max number of expanded results to bring back. Default is 500.




          Joel Bernstein made changes -
          Link This issue is related to SOLR-5045 [ SOLR-5045 ]
          Simon Endele added a comment -

          What do you mean exactly by "there is no concept of ngroups or group facets"? Does that include that there will be no possibility to return the number of groups, like the request parameter "group.ngroups" currently does?

          Will it still be possible to decide if the faceting is done before/after collapsing, similar to "group.facet"?

          Joel Bernstein added a comment - - edited

          When the CollapsingQParserPlugin is used, the number of groups is simply numFound. So you get ngroups automatically.

          As for faceting, the initial release will only return post-collapse facet counts, which is often what people want.

          This ticket though is part of a larger design that includes SOLR-5045, which is designed to have facets and other aggregate functions calculated as PostFilter "Aggregators". This would allow you to use the "cost" PostFilter parameter to choose the order the PostFilter is applied. So in the same query you could have some aggregates/facets calculated pre-collapse and some calculated post collapse.

          One of the central ideas of this design is to allow for pluggable collapse and aggregating functions, through the PostFilter mechanism.
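The relationship between numFound, the group count, and pre- versus post-collapse facet counts can be sketched in Python. This models the idea only; `collapse_by_score`, `facet_counts`, and the sample documents are assumptions and do not reflect the plugin's or SOLR-5045's code.

```python
from collections import Counter
from typing import Any, Dict, List

Doc = Dict[str, Any]


def collapse_by_score(docs: List[Doc], field: str) -> List[Doc]:
    """Keep the highest-scoring doc for each distinct value of `field`."""
    best: Dict[Any, Doc] = {}
    for doc in docs:
        key = doc[field]
        if key not in best or doc["score"] > best[key]["score"]:
            best[key] = doc
    return list(best.values())


def facet_counts(docs: List[Doc], facet_field: str) -> Counter:
    """Count documents per value of `facet_field`."""
    return Counter(doc[facet_field] for doc in docs)


hits = [
    {"id": 1, "group": "a", "color": "red", "score": 3.0},
    {"id": 2, "group": "a", "color": "blue", "score": 2.0},
    {"id": 3, "group": "b", "color": "red", "score": 1.0},
    {"id": 4, "group": "b", "color": "red", "score": 0.5},
]

pre_collapse = facet_counts(hits, "color")        # counted before collapsing
collapsed = collapse_by_score(hits, "group")
post_collapse = facet_counts(collapsed, "color")  # counted after collapsing
num_found = len(collapsed)                        # equals the number of groups
```

Which of the two counts a request sees depends only on whether the counting step runs before or after the collapse, which is the ordering the "cost" parameter is described as controlling.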

          Simon Endele added a comment - - edited

          Sounds good.

          I propose to add an additional parameter "expand.fq" to restrict the expanded documents to a certain filter query.
          Sometimes the complete groups are very large and should only be expanded by one or a few representatives of that group (which can be addressed with a filter query). Other group members that are not hit by the main query are not interesting (at least in the first place).

          Note that this is different from adding a basic filter query, since documents that are hit by the main query but not by expand.fq are kept.
          Example: Group consisting of: representative "A", more group members "B" and "C".
          Query hits "B", group is expanded by "A" (due to expand.fq), but not "C" => Result: "A", "B"
          A filter query before expanding would filter out "B" and thus yield no results for this group.
          A filter query after expanding would filter out "B" and "C" thus keep only "A".

          Is that technically possible? Maybe this is worth a separate issue...
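The A/B/C example can be written out as a tiny Python sketch to compare the three behaviors. Everything here is hypothetical: `expand.fq` is only a proposal at this point, and the function names and matching rules are assumptions chosen to mirror the example.

```python
from typing import List

GROUP = ["A", "B", "C"]  # one group: representative A plus members B and C


def hit_by_query(doc: str) -> bool:
    return doc == "B"  # the main query matches only B


def hit_by_expand_fq(doc: str) -> bool:
    return doc == "A"  # the proposed expand.fq selects the representative


def with_expand_fq() -> List[str]:
    """Keep the query hits and add group members matched by expand.fq."""
    hits = [d for d in GROUP if hit_by_query(d)]
    if not hits:
        return []
    added = [d for d in GROUP if d not in hits and hit_by_expand_fq(d)]
    return sorted(hits + added)


def filter_before_expanding() -> List[str]:
    """An ordinary filter restricts the main query, so B is removed up front."""
    return [d for d in GROUP if hit_by_query(d) and hit_by_expand_fq(d)]


def filter_after_expanding() -> List[str]:
    """Expand the whole group first, then apply the filter to every member."""
    expanded = GROUP if any(hit_by_query(d) for d in GROUP) else []
    return [d for d in expanded if hit_by_expand_fq(d)]
```

Under these assumptions `with_expand_fq()` returns A and B, `filter_before_expanding()` returns nothing, and `filter_after_expanding()` returns only A, matching the three outcomes described above.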

          Joel Bernstein added a comment -

          Latest work on the CollapsingQParserPlugin. The ExpandComponent was removed from this patch in order to focus on the collapse first.

          Joel Bernstein made changes -
          Attachment SOLR-5027.patch [ 12605336 ]
          Joel Bernstein made changes -
          Description
          This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*.


          The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.

          Collapse based on the highest scoring document:

          <*code*>
          fq={!collapse field=<field_name>}
          <*code*>

          Collapse based on the min value of a numeric field:
          <*code*>
          fq={!collapse field=<field_name> min=<field_name>}
          <*code*>

          Collapse based on the max value of a numeric field:
          <*code*>
          fq={!collapse field=<field_name> max=<field_name>}
          <*code*>

          Collapse with a null policy:
          <*code*>
          fq={!collapse field=<field_name> nullPolicy=<null_policy>}
          <*code*>
          There are three null policies:
          ignore : removes values docs with a null value in the collapse field (default).
          expand : treats each doc with a null value in the collapse field as a separate group.
          collapse : collapses all docs with null value into a single group use either highest score, or min/max.









          The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided.

          Initial syntax:

          expand=true - Turns on the expand component.
          expand.field=<field> - Expands results for this field
          expand.limit=5 - Limits the documents for each expanded group.
          expand.sort=<sort spec> - The sort spec for the expanded documents. Default is score.
          expand.rows=500 - The max number of expanded results to bring back. Default is 500.

          *Note:* Recent patches don't contain the expand component. The July 16 patch does. This will be brought back in when the collapse is finished, or possibly moved to its own ticket.




          Joel Bernstein made changes -
          Description
          This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*.


          The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.

          Collapse based on the highest scoring document:

          {*code*}
          fq={!collapse field=<field_name>}
          <*code*>

          Collapse based on the min value of a numeric field:
          {*code*}
          fq={!collapse field=<field_name> min=<field_name>}
          <*code*>

          Collapse based on the max value of a numeric field:
          {*code*}
          fq={!collapse field=<field_name> max=<field_name>}
          <*code*>

          Collapse with a null policy:
          {*code*}
          fq={!collapse field=<field_name> nullPolicy=<null_policy>}
          {*code*}
          There are three null policies:
          ignore : removes values docs with a null value in the collapse field (default).
          expand : treats each doc with a null value in the collapse field as a separate group.
          collapse : collapses all docs with null value into a single group use either highest score, or min/max.









          The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided.

          Initial syntax:

          expand=true - Turns on the expand component.
          expand.field=<field> - Expands results for this field
          expand.limit=5 - Limits the documents for each expanded group.
          expand.sort=<sort spec> - The sort spec for the expanded documents. Default is score.
          expand.rows=500 - The max number of expanded results to bring back. Default is 500.

          *Note:* Recent patches don't contain the expand component. The July 16 patch does. This will be brought back in when the collapse is finished, or possibly moved to its own ticket.




          Joel Bernstein made changes -
          Description
          This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*.


          The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.

          Collapse based on the highest scoring document:

          {code}
          fq={!collapse field=<field_name>}
          {code}

          Collapse based on the min value of a numeric field:
          {code}
          fq={!collapse field=<field_name> min=<field_name>}
          {code}

          Collapse based on the max value of a numeric field:
          {code}
          fq={!collapse field=<field_name> max=<field_name>}
          {code}

          Collapse with a null policy:
          {code}
          fq={!collapse field=<field_name> nullPolicy=<null_policy>}
          {code}
          There are three null policies:
          ignore : removes values docs with a null value in the collapse field (default).
          expand : treats each doc with a null value in the collapse field as a separate group.
          collapse : collapses all docs with null value into a single group use either highest score, or min/max.









          The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided.

          Initial syntax:

          expand=true - Turns on the expand component.
          expand.field=<field> - Expands results for this field
          expand.limit=5 - Limits the documents for each expanded group.
          expand.sort=<sort spec> - The sort spec for the expanded documents. Default is score.
          expand.rows=500 - The max number of expanded results to bring back. Default is 500.

          *Note:* Recent patches don't contain the expand component. The July 16 patch does. This will be brought back in when the collapse is finished, or possibly moved to its own ticket.




          Joel Bernstein made changes -
          Description
          This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*.


          The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.

          Collapse based on the highest scoring document:

          {code}
          fq={!collapse field=<field_name>}
          {code}

          Collapse based on the min value of a numeric field:
          {code}
          fq={!collapse field=<field_name> min=<field_name>}
          {code}

          Collapse based on the max value of a numeric field:
          {code}
          fq={!collapse field=<field_name> max=<field_name>}
          {code}

          Collapse with a null policy:
          {code}
          fq={!collapse field=<field_name> nullPolicy=<null_policy>}
          {code}
          There are three null policies:
          ignore : removes docs with a null value in the collapse field (default).
          expand : treats each doc with a null value in the collapse field as a separate group.
          collapse : collapses all docs with a null value into a single group using either highest score, or min/max.









          The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided.

          Initial syntax:

          expand=true - Turns on the expand component.
          expand.field=<field> - Expands results for this field
          expand.limit=5 - Limits the documents for each expanded group.
          expand.sort=<sort spec> - The sort spec for the expanded documents. Default is score.
          expand.rows=500 - The max number of expanded results to bring back. Default is 500.

          *Note:* Recent patches don't contain the expand component. The July 16 patch does. This will be brought back in when the collapse is finished, or possibly moved to its own ticket.




          Joel Bernstein added a comment -

          Simon,

          Your idea sounds good and should be no problem to implement. When I finish up the work on the Expand component I'll work on getting this in.

          Joel Bernstein added a comment -

          Added test cases

          Joel Bernstein made changes -
          Attachment SOLR-5027.patch [ 12606461 ]
          Prabha Satya added a comment -

          Hi Joel,
          From the comments above, I gather that the collapse plugin would allow us to do aggregations. But I am not sure whether it can help me achieve something like the following, which I have expressed in SQL for clarity.

          Schema:
          =======
          Student id
          subject
          marks

          Query:
          =====
          Select subject,max(marks) from Student group by subject.

          Joel Bernstein added a comment - - edited

          fq={!collapse field=student max=marks}

          will be similar to the SQL statement that you want. But that is a little misleading, because only max and min are supported; other aggregation functions such as sum() are not. You could pair the CollapsingQParserPlugin with the stats component, or in the future SOLR-5045, to do true aggregation.

          Note that Solr field collapsing/grouping provides similar functionality and is available now. This ticket is still under development.
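To make the difference from SQL concrete, here is a minimal Python sketch. It is an illustration only; `collapse_max` and the sample rows are assumptions, and the usage passes `subject` as the collapse key purely to mirror the GROUP BY in the question.

```python
from typing import Any, Dict, List

Doc = Dict[str, Any]


def collapse_max(docs: List[Doc], field: str, max_field: str) -> List[Doc]:
    """Keep, per distinct value of `field`, the doc with the largest `max_field`."""
    best: Dict[Any, Doc] = {}
    for doc in docs:
        key = doc[field]
        if key not in best or doc[max_field] > best[key][max_field]:
            best[key] = doc
    return list(best.values())


rows = [
    {"student_id": 1, "subject": "math", "marks": 70},
    {"student_id": 2, "subject": "math", "marks": 90},
    {"student_id": 3, "subject": "art", "marks": 80},
]

top = collapse_max(rows, "subject", "marks")
# Collapsing returns whole documents, so the student id is still available.
summary = {doc["subject"]: doc["marks"] for doc in top}

# A per-subject sum has no collapse equivalent: it needs every row of the
# group, not one surviving document, hence the need for real aggregation.
totals: Dict[str, int] = {}
for row in rows:
    totals[row["subject"]] = totals.get(row["subject"], 0) + row["marks"]
```

The collapse keeps one document per group, so it can answer "which row has the max" but not "what is the total", which is why a separate aggregation step is suggested for anything beyond min and max.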

          Joel Bernstein added a comment -

          Added support for the QueryElevationComponent and test case.

          Joel Bernstein made changes -
          Attachment SOLR-5027.patch [ 12607620 ]
          Joel Bernstein made changes -
          Summary Result Set Collapse and Expand Plugins Field Collapsing PostFilter
          Joel Bernstein made changes -
          Description This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*.


          The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.

          Collapse based on the highest scoring document:

          {code}
          fq={!collapse field=<field_name>}
          {code}

          Collapse based on the min value of a numeric field:
          {code}
          fq={!collapse field=<field_name> min=<field_name>}
          {code}

          Collapse based on the max value of a numeric field:
          {code}
          fq={!collapse field=<field_name> max=<field_name>}
          {code}

          Collapse with a null policy:
          {code}
          fq={!collapse field=<field_name> nullPolicy=<null_policy>}
          {code}
          There are three null policies:
          ignore : removes docs with a null value in the collapse field (default).
          expand : treats each doc with a null value in the collapse field as a separate group.
          collapse : collapses all docs with a null value into a single group using either highest score, or min/max.









          The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided.

          Initial syntax:

          expand=true - Turns on the expand component.
          expand.field=<field> - Expands results for this field
          expand.limit=5 - Limits the documents for each expanded group.
          expand.sort=<sort spec> - The sort spec for the expanded documents. Default is score.
          expand.rows=500 - The max number of expanded results to bring back. Default is 500.

          *Note:* Recent patches don't contain the expand component. The July 16 patch does. This will be brought back in when the collapse is finished, or possibly moved to its own ticket.
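As a rough illustration of how the initial expand parameters listed above might be combined with a collapse filter in one request, the sketch below builds a URL query string. The parameter names are the "initial syntax" from this description and may differ in any released version; the `group_id` field, the `q=*:*` query, and the `build_expand_query` helper are assumptions made for the example.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode


def build_expand_query(collapse_field: str, limit: int = 5,
                       sort: Optional[str] = None, rows: int = 500) -> str:
    """Build a query string that collapses on a field and asks the
    expand component to expand the groups on the current page."""
    params = [
        ("q", "*:*"),  # assumed match-all query
        ("fq", "{!collapse field=%s}" % collapse_field),
        ("expand", "true"),
        ("expand.field", collapse_field),
        ("expand.limit", str(limit)),
        ("expand.rows", str(rows)),
    ]
    if sort is not None:
        # Omitted by default so that the stated default (score) applies.
        params.append(("expand.sort", sort))
    return urlencode(params)


query_string = build_expand_query("group_id")
decoded = parse_qs(query_string)  # round-trip to check the encoding
print(query_string)
```

`urlencode` percent-encodes the braces and the `!` in the local-params syntax, which avoids problems when the string is appended to a request URL.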




          This ticket introduces the *CollapsingQParserPlugin*

          The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high.

          For example in one performance test, a search with 10 million full results and 1 million collapsed groups:
          Standard grouping with ngroups : 17 seconds.
          CollapsingQParserPlugin: 300 milli-seconds.

          Sample syntax:

          Collapse based on the highest scoring document:

          {code}
          fq={!collapse field=<field_name>}
          {code}

          Collapse based on the min value of a numeric field:
          {code}
          fq={!collapse field=<field_name> min=<field_name>}
          {code}

          Collapse based on the max value of a numeric field:
          {code}
          fq={!collapse field=<field_name> max=<field_name>}
          {code}

          Collapse with a null policy:
          {code}
          fq={!collapse field=<field_name> nullPolicy=<null_policy>}
          {code}
          There are three null policies:
          ignore : removes docs with a null value in the collapse field (default).
          expand : treats each doc with a null value in the collapse field as a separate group.
          collapse : collapses all docs with a null value into a single group using either highest score, or min/max.

          *Note:* The July 16 patch also includes an ExpandComponent that expands the collapsed groups for the current search result page. This functionality will be moved to its own ticket.
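The three null policies described above can be illustrated with a small in-memory sketch that collapses on the highest score. It only models the described behavior and is not the plugin's code; the `collapse_by_score` helper, the `group` and `score` keys, and the sample documents are assumptions.

```python
from typing import Any, Dict, List, Optional

Doc = Dict[str, Any]


def collapse_by_score(docs: List[Doc], field: str,
                      null_policy: str = "ignore") -> List[Doc]:
    """Collapse docs on `field`, keeping the highest-scoring doc per group.

    null_policy:
      ignore   - drop docs whose collapse field is missing/None (default)
      expand   - keep every such doc as its own group
      collapse - treat all such docs as one group
    """
    if null_policy not in ("ignore", "expand", "collapse"):
        raise ValueError("unknown null policy: %r" % null_policy)

    best: Dict[Any, Doc] = {}          # group value -> best doc so far
    null_best: Optional[Doc] = None    # best null doc under 'collapse'
    null_docs: List[Doc] = []          # all null docs under 'expand'

    for doc in docs:
        key = doc.get(field)
        if key is None:
            if null_policy == "expand":
                null_docs.append(doc)
            elif null_policy == "collapse":
                if null_best is None or doc["score"] > null_best["score"]:
                    null_best = doc
            continue  # 'ignore' simply drops the doc
        if key not in best or doc["score"] > best[key]["score"]:
            best[key] = doc

    result = list(best.values()) + null_docs
    if null_best is not None:
        result.append(null_best)
    return sorted(result, key=lambda d: d["score"], reverse=True)


# Hypothetical sample data: two docs in group g1, two with no group.
docs = [
    {"id": "a", "group": "g1", "score": 1.0},
    {"id": "b", "group": "g1", "score": 3.0},
    {"id": "c", "group": None, "score": 2.0},
    {"id": "d", "group": None, "score": 0.5},
]

for policy in ("ignore", "expand", "collapse"):
    print(policy, [d["id"] for d in collapse_by_score(docs, "group", policy)])
```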




          Joel Bernstein made changes -
          Assignee Joel Bernstein [ joel.bernstein ]
          Joel Bernstein made changes -
          Fix Version/s 4.6 [ 12325000 ]
          Fix Version/s 5.0 [ 12321664 ]
          Joel Bernstein added a comment -

          Fixed broken test, added javadoc

          Joel Bernstein made changes -
          Attachment SOLR-5027.patch [ 12607680 ]
          Joel Bernstein added a comment -

          Patch for 4x and trunk. All tests passing.

          Joel Bernstein made changes -
          Attachment SOLR-5027.patch [ 12607941 ]
          Joel Bernstein made changes -
          Attachment SOLR-5027.patch [ 12608075 ]
          Joel Bernstein added a comment -

          Patch that passes precommit for trunk

          shruti suri added a comment -

          Hi,

          Can I apply this patch on solr-4.2.1 version?

          Regards
          Shruti

          Joel Bernstein added a comment -

          Shruti,

          I think it should work on 4.2+. Below that it won't. This patch was tested on 4.4+.

          Joel

          shruti suri added a comment -

          Hi,

          I am applying this patch to solr-4.2.1 with the steps below:

          $ cd <your Solr trunk checkout dir>
          $ svn up
          $ wget https://issues.apache.org/jira/secure/attachment/12607941/SOLR-5027.patch -O - | patch -p0 --dry-run

          But I got some errors during the operation:

          patching file SOLR-5027.patch
          Hunk #1 FAILED at 39.
          1 out of 1 hunk FAILED -- saving rejects to file SOLR-5027.patch.rej
          patching file core/src/test/org/apache/solr/search/TestCollapseQParserPlugin.java
          can't find file to patch at input line 150
          Perhaps you used the wrong -p or --strip option?
          The text leading up to this was:
          --------------------------

          Index: solr/core/src/test/org/apache/solr/search/QueryEqualityTest.java
          ===================================================================
          --- solr/core/src/test/org/apache/solr/search/QueryEqualityTest.java (revision 1531006)
          +++ solr/core/src/test/org/apache/solr/search/QueryEqualityTest.java (working copy)
          --------------------------
          File to patch:

          Can you please help me solve this problem?

          Regards
          Shruti

          Joel Bernstein added a comment -

          I had problems applying this patch to 4.2.1 as well. The patch is applied from the Lucene/Solr root below. You'll see, though, that my errors are different. There are two files, ivy.xml and QParserPlugin.java, that are different enough that the patch did not apply. I'm not sure exactly what the issue is with your error. I'll see if I can create a patch that will work with 4.2.1.

          $ patch -p0 < SOLR-5027.patch
          patching file solr/core/ivy.xml
          Hunk #1 FAILED at 39.
          1 out of 1 hunk FAILED -- saving rejects to file solr/core/ivy.xml.rej
          patching file solr/core/src/test/org/apache/solr/search/TestCollapseQParserPlugin.java
          patching file solr/core/src/test/org/apache/solr/search/QueryEqualityTest.java
          patching file solr/core/src/test-files/solr/collection1/conf/solrconfig-collapseqparser.xml
          patching file solr/core/src/java/org/apache/solr/search/CollapsingQParserPlugin.java
          patching file solr/core/src/java/org/apache/solr/search/QParserPlugin.java
          Hunk #1 FAILED at 51.
          1 out of 1 hunk FAILED -- saving rejects to file solr/core/src/java/org/apache/solr/search/QParserPlugin.java.rej
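Why the working directory matters with `-p0` can be sketched as follows: `patch -p<N>` drops N leading path components from each file name in the patch before looking for the file relative to the current directory. The sketch below models only that rule for simple relative paths; the `checkout` directory name and both helpers are hypothetical, and it is not offered as a diagnosis of the specific errors reported above.

```python
import posixpath


def strip_components(name: str, count: int) -> str:
    """Drop `count` leading path components from a relative file name,
    mimicking the effect of patch's -p<count> option."""
    return "/".join(name.split("/")[count:])


def target_path(cwd: str, name: str, count: int) -> str:
    """Path patch would try to modify when run from `cwd`."""
    return posixpath.join(cwd, strip_components(name, count))


# File name as it appears in this patch's "Index:" / "---" headers.
header_name = "solr/core/ivy.xml"

# Run from the Lucene/Solr root with -p0: the real file is found.
from_root = target_path("checkout", header_name, 0)

# Run from the solr/ subdirectory with -p0: a path that does not exist.
from_solr_dir = target_path("checkout/solr", header_name, 0)

# Run from the solr/ subdirectory with -p1: the real file again.
from_solr_dir_p1 = target_path("checkout/solr", header_name, 1)

print(from_root)
print(from_solr_dir)
print(from_solr_dir_p1)
```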

          Joel Bernstein added a comment -

          Shruti,

          I got the patch applied, but it won't compile on 4.2. You'd also need to apply SOLR-5020, and there is another issue with the TermsEnum interface which I suspect will be harder to solve. So I think the best approach would be to upgrade to at least 4.4. SOLR-5020 is part of 4.5.

          Joel

          ASF subversion and git services added a comment -

          Commit 1535208 from Joel Bernstein in branch 'dev/trunk'
          [ https://svn.apache.org/r1535208 ]

          SOLR-5027 CollapsingQParserPlugin

          ASF subversion and git services added a comment -

          Commit 1535259 from Joel Bernstein in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1535259 ]

          SOLR-5027 CollapsingQParserPlugin

          Joel Bernstein made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          ASF subversion and git services added a comment -

          Commit 1535614 from Joel Bernstein in branch 'dev/trunk'
          [ https://svn.apache.org/r1535614 ]

          SOLR-5027: Added error handling to CollapsingQParserPlugin

          ASF subversion and git services added a comment -

          Commit 1535615 from Joel Bernstein in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1535615 ]

          SOLR-5027: Added error handling to CollapsingQParserPlugin

          shruti suri added a comment -

          Hi,

          I am facing some problems while applying this patch to Solr-4.5. Please tell me whether this patch can be applied to solr-4.5.

          Regards
          Shruti

          Shawn Heisey added a comment -

          shruti suri, the following commands will use SVN to check out the 4.5.1 source code and apply the commits for this issue to it. You must have subversion (1.7 or later preferred) on your computer already:

          svn co https://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_5_1
          cd lucene_solr_4_5_1/
          svn merge --accept postpone -c 1535208 https://svn.apache.org/repos/asf/lucene/dev/trunk
          svn merge -c 1535614 https://svn.apache.org/repos/asf/lucene/dev/trunk
          

          There is one merge conflict that cannot be automatically resolved, but it is on CHANGES.txt, which is not required for the patch to operate. The '--accept postpone' argument that I have put on the first merge command will cause this to be skipped.

          We strongly recommend running 4.5.1, not 4.5.0, because there are a number of critical bugs that have been fixed.

          shruti suri added a comment -

          Hi,

          I implemented the above solution and ran the following commands:

          cd lucene_solr_4_5_1/solr
          ant dist

          I again got an error:

          [ivy:retrieve] http://mirror.netcologne.de/maven2/com/carrotsearch/hppc/${/com.carrotsearch/hppc}/hppc-${/com.carrotsearch/hppc}.jar
          [ivy:retrieve] ::::::::::::::::::::::::::::::::::::::::::::::
          [ivy:retrieve] :: UNRESOLVED DEPENDENCIES ::
          [ivy:retrieve] ::::::::::::::::::::::::::::::::::::::::::::::
          [ivy:retrieve] :: com.carrotsearch#hppc;${/com.carrotsearch/hppc}: not found
          [ivy:retrieve] ::::::::::::::::::::::::::::::::::::::::::::::
          [ivy:retrieve]
          [ivy:retrieve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS

          BUILD FAILED
          /lucene_solr_4_5_1/solr/common-build.xml:354: The following error occurred while executing this line:
          /lucene_solr_4_5_1/solr/core/build.xml:55: impossible to resolve dependencies:
          resolve failed - see output for details

          Regards
          shruti

          Shawn Heisey added a comment -

          shruti suri I see that happening too. I never had a chance to actually try building it with the commits merged. I have no idea how to fix problems with ivy.

          The ivy.xml change for hppc that is not working is in branch_4x as well, and that branch compiles. This problem is beyond my skills.

          Steve Rowe added a comment - - edited

          shruti suri, the problem is that as of Lucene/Solr 4.6, all ivy.xml versions are pulled from lucene/ivy-versions.properties - see LUCENE-5249 and LUCENE-5257 - but not in the lucene_solr_4_5 branch.

          You can look up the correct ivy.xml version to use in the 4.6 branch, rather than the /com.carrotsearch/hppc thing that's on branch_4x.

          Shawn Heisey added a comment - - edited

          Thanks Steve Rowe! shruti suri, if you edit solr/core/ivy.xml after the merge, you can change the /com.carrotsearch/hppc property substitution to 0.5.2 and it should work properly. That was the version I found in branch_4x for the hppc component.
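For anyone who prefers to script that one-line edit, a minimal sketch follows. The placeholder text is taken from the ivy error output above and the version 0.5.2 from this comment; the sample `<dependency>` line is an assumed shape rather than a copy of the real file, and the helper names are hypothetical.

```python
from pathlib import Path

# Unresolved property reference, as shown in the ivy error output.
HPPC_PLACEHOLDER = "${/com.carrotsearch/hppc}"


def pin_hppc_version(ivy_xml: str, version: str = "0.5.2") -> str:
    """Replace the unresolved hppc version property with a literal version."""
    return ivy_xml.replace(HPPC_PLACEHOLDER, version)


def pin_hppc_in_file(path: str, version: str = "0.5.2") -> None:
    """Apply the same replacement to a file in place (e.g. solr/core/ivy.xml)."""
    ivy_file = Path(path)
    ivy_file.write_text(pin_hppc_version(ivy_file.read_text(), version))


# Assumed shape of the dependency line; check the real file before relying on it.
sample = '<dependency org="com.carrotsearch" name="hppc" rev="${/com.carrotsearch/hppc}"/>'
pinned = pin_hppc_version(sample)
print(pinned)
```

A plain string replacement is used instead of an XML parser so that the rest of the file, including comments and formatting, is left byte-for-byte unchanged.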

          I was trying to boil it down to a patch, but ran into some problems. Fixing the one line manually is easier.

          shruti suri added a comment -

          Thanks a lot, my patch worked.

          Brandon Chapman made changes -
          Link This issue is related to SOLR-5408 [ SOLR-5408 ]
          David Boychuck added a comment -

          I am getting the following error. Please advise how to fix it:

          3095070 [http-bio-8080-exec-8] ERROR org.apache.solr.core.SolrCore – java.lang.NullPointerException
          at org.apache.solr.search.CollapsingQParserPlugin$CollapsingScoreCollector.collect(CollapsingQParserPlugin.java:409)
          at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:910)
          at org.apache.solr.request.SimpleFacets.parseParams(SimpleFacets.java:219)
          at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:549)
          at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:265)
          at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:78)
          at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
          at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
          at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
          at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
          at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
          at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
          at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
          at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
          at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
          at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
          at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
          at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
          at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
          at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1008)
          at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
          at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
          at java.lang.Thread.run(Thread.java:722)

          13095072 [http-bio-8080-exec-8] ERROR org.apache.solr.servlet.SolrDispatchFilter – null:java.lang.NullPointerException
          at org.apache.solr.search.CollapsingQParserPlugin$CollapsingScoreCollector.collect(CollapsingQParserPlugin.java:409)
          at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:910)
          at org.apache.solr.request.SimpleFacets.parseParams(SimpleFacets.java:219)
          at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:549)
          at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:265)
          at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:78)
          at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
          at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
          at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
          at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
          at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
          at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
          at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
          at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
          at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
          at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
          at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
          at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
          at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
          at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1008)
          at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
          at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
          at java.lang.Thread.run(Thread.java:722)

          David Boychuck added a comment -

          Looks like the error is only happening on queries that I use tagging on

          David Boychuck added a comment -

          Here is an example query where I'm getting the error:

          /productQuery?fq=discontinued:false&fq={!tag=manufacturer_string}manufacturer_string:("delta"%20OR%20"kohler")&fq=siteid:82&sort=score%20desc&facet=true&facet.mincount=1&facet.sort=index&start=0&rows=48&fl=productid,manufacturer,uniqueFinish,uniqueid,productCompositeid,score&facet.query={!ex=onSale}onSale:true&facet.query={!ex=rating}rating:[4%20TO%20*]&facet.query={!ex=rating}rating:[3%20TO%20*]&facet.query={!ex=rating}rating:[2%20TO%20*]&facet.query={!ex=rating}rating:[1%20TO%20*]&facet.query={!ex=MadeinAmerica_boolean}MadeinAmerica_boolean:yes&facet.query={!ex=inStock}inStock:true&facet.query={!ex=PulloutSpray_string}PulloutSpray_string:yes&facet.query={!ex=HandlesIncluded_string}HandlesIncluded_string:yes&facet.query={!ex=Electronic_string}Electronic_string:yes&facet.query={!ex=FlowRateGPM_numeric}FlowRateGPM_numeric:[0%20TO%201]&facet.query={!ex=FlowRateGPM_numeric}FlowRateGPM_numeric:[1%20TO%202]&facet.query={!ex=FlowRateGPM_numeric}FlowRateGPM_numeric:[2%20TO%203]&facet.query={!ex=FlowRateGPM_numeric}FlowRateGPM_numeric:[4%20TO%205]&facet.query={!ex=FlowRateGPM_numeric}FlowRateGPM_numeric:[3%20TO%204]&facet.query={!ex=FlowRateGPM_numeric}FlowRateGPM_numeric:[5%20TO%20*]&facet.query={!ex=ADA_string}ADA_string:yes&facet.query={!ex=WaterSenseCertified_string}WaterSenseCertified_string:yes&facet.query={!ex=WaterfallFaucet_boolean}WaterfallFaucet_boolean:yes&facet.query={!ex=InstallationAvailable_string}InstallationAvailable_string:yes&facet.query={!ex=LowLeadCompliant_string}LowLeadCompliant_string:yes&facet.query={!ex=DrainAssemblyIncluded_string}DrainAssemblyIncluded_string:yes&facet.query={!ex=EscutcheonIncluded_string}EscutcheonIncluded_string:yes&facet.field=NumberOfHandles_numeric&facet.field=pricebook_1_fs&facet.field=SpoutReach_numeric&facet.field=SpoutHeight_numeric&facet.field=FaucetCenters_numeric&facet.field=OverallHeight_numeric&facet.field=FaucetHoles_numeric&facet.field=HandleStyle_string&facet.field=masterFinish_string&facet.field={!ex=manufacturer_string}manufacturer_string&facet.field=HandleMaterial_string&facet.field=ValveType_string&facet.field=Theme_string&facet.field=MountingType_string&qt=/productQuery&qf=sku^9.0%20upc^9.1%20keywords_82_txtws^1.9%20uniqueid^9.0%20series^2.8%20productTitle^1.2%20productid^9.0%20manufacturer^4.0%20masterFinish^1.5%20theme^1.1%20categoryNames_82_txt^0.2%20finish^1.4&pf=keywords_82_txtws^2.1%20productTitle^1.5%20manufacturer^4.0%20finish^1.9&bf=linear(popularity_82_i,1,2)^3.0&q.alt=categories_82_is:108503

          David Boychuck added a comment -

          When I take the {!tag} out I don't get the error. It looks like the CollapsingQParserPlugin doesn't work with tagging? Can you confirm?

          David Boychuck added a comment -

          I have posted this information on Solr User: http://lucene.472066.n3.nabble.com/Error-with-CollapsingQParserPlugin-when-trying-to-use-tagging-td4098709.html

          David Boychuck added a comment - - edited

          I created the following unit test in TestCollapseQParserPlugin.java to illustrate the bug:

              ModifiableSolrParams params = new ModifiableSolrParams();
              params.add("q", "*:*");
              // Collapse on group_s, keeping the highest-scoring document per group
              params.add("fq", "{!collapse field=group_s}");
              params.add("defType", "edismax");
              params.add("bf", "field(test_ti)");
              // Tagged filter plus a facet that excludes it
              params.add("fq", "{!tag=test_ti}test_ti:5");
              params.add("facet", "true");
              params.add("facet.field", "{!ex=test_ti}test_ti");
              assertQ(req(params), "*[count(//doc)=1]", "//doc[./int[@name='test_ti']='5']");
          
          David Boychuck made changes -
          Link This issue is broken by SOLR-5416 [ SOLR-5416 ]
          Greg Harris added a comment -

          I have a request from a customer who would really benefit from this filter: the ability to sort by two fields. I have looked into the code and understand this may not be easily feasible. Just getting it out there.

          Joel Bernstein added a comment - - edited

          David,

          I was reading your comments while I was away on vacation but my mobile device wasn't playing nicely with the jira site, so I held off on replying until I got back.

          I see the issue that you've reported and I'll be working on it through the jira that you created. I'll be posting to that jira with my thoughts soon.

          Joel

          Joel Bernstein added a comment -

          Greg,

          Are you asking for the ability to use full sort spec as the collapse criteria? I believe you are, but I just want to clarify.

          You can currently use the full sort spec to sort the collapsed result set, but only the min/max of a numeric field as the collapse criteria.

          Joel
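
To make the distinction concrete: the collapse criteria sit inside the fq local params, while the ordering of the collapsed result set comes from the ordinary sort parameter. Below is a minimal sketch of assembling such a request string. The field names (group_s, price_f, popularity_i) are hypothetical, and CollapseRequestSketch is an illustrative helper, not part of Solr or SolrJ.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper only; class and field names are assumptions.
class CollapseRequestSketch {

    // URL-encodes one value. UTF-8 is always supported, so the checked
    // exception is rethrown unchecked.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Group head is chosen by the min of a single numeric field;
    // the collapsed result is then ordered by a full sort spec.
    static String buildQuery(String collapseField, String minField, String sortSpec) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");
        params.put("fq", "{!collapse field=" + collapseField + " min=" + minField + "}");
        params.put("sort", sortSpec);

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("group_s", "price_f", "popularity_i desc, score desc"));
    }
}
```

The sort spec may name several fields, whereas the min (or max) argument names exactly one numeric field.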

          David Boychuck added a comment -

          Joel,

          I submitted a fix in https://issues.apache.org/jira/browse/SOLR-5416

          Let me know if you think this is problematic.

          Gabe Enslein added a comment -

          Hey Joel,

          After reviewing the functionality as an alternative to using ngroups for performance reasons, I have a use case that needs the sorting specified in the search to be respected before collapsing. I have documents that can have the same score, but when collapsing is performed, it only takes the first received document. In many cases this document is less important or relevant and would normally be lower in the results if sorting pre-collapse was respected. Is there a possibility that this functionality could be adapted to respect sorting before collapsing?

          Joel Bernstein added a comment - - edited

          Hi Gabe,

          What I was planning to implement first is group head selection based on the min/max value of a function output. After that I was planning to implement group head selection based on a combination of score and one other criterion. I'd like to have function-based collapse criteria by the Solr 4.7 release.

          Joel

          Trey Grainger added a comment -

          Interesting. I've been playing around with the Collapsing QParser and, because of the reason Gabe mentioned, I can think of very few use cases for it in its current implementation. Specifically, because there is no way to break a tie between multiple documents with the same value (the way sorting does), a search that is sorted by score desc, modifieddt desc (newer documents break the tie) is not possible... it just collapses based upon the first document in the index with the duplicate score. Many of my use cases are even trickier... something like sort by displaypriority desc, score desc, modifieddt desc.

          Just brainstorming here, but if sorting documents before collapsing is not possible (due to where in the code stack the collapsing occurs), then it might be possible to just implement a "sort" function (ValueSource) that gave an ordinal score to each document based upon the position it would occur within all documents. If I understand what you mean when you say "group head selection based upon the min/max of the function", then this would effectively allow collapsing sorted values, because the sort function would return higher values for documents which would sort higher. In that case, the sort function (which could read in the current sort parameter from the search request) could even be the default used by collapsing, since that is probably what users are expecting to happen (this is consistent with how grouping works, for example).

          Thoughts?
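
The tie-breaking gap described above can be illustrated with a small stand-alone simulation. This is not the plugin's code: Doc, collapseByScore and collapseBySort are hypothetical names, and the in-memory list stands in for index order. It sketches, under the assumption stated in the comment that the first document seen wins on equal scores, why a secondary key such as modifieddt is needed to pick the group head.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of collapsing; not the plugin's implementation.
class TieBreakSketch {

    static final class Doc {
        final int id;
        final String group;
        final float score;
        final long modified;

        Doc(int id, String group, float score, long modified) {
            this.id = id;
            this.group = group;
            this.score = score;
            this.modified = modified;
        }
    }

    // Highest score wins; on an equal score the document seen first is kept.
    static Map<String, Doc> collapseByScore(List<Doc> docsInIndexOrder) {
        Map<String, Doc> heads = new LinkedHashMap<>();
        for (Doc d : docsInIndexOrder) {
            Doc head = heads.get(d.group);
            if (head == null || d.score > head.score) {
                heads.put(d.group, d);
            }
        }
        return heads;
    }

    // True if a would be listed before b under "score desc, modified desc".
    static boolean sortsBefore(Doc a, Doc b) {
        if (a.score != b.score) {
            return a.score > b.score;
        }
        return a.modified > b.modified;
    }

    // Keeps, per group, the document the full sort would rank first.
    static Map<String, Doc> collapseBySort(List<Doc> docsInIndexOrder) {
        Map<String, Doc> heads = new LinkedHashMap<>();
        for (Doc d : docsInIndexOrder) {
            Doc head = heads.get(d.group);
            if (head == null || sortsBefore(d, head)) {
                heads.put(d.group, d);
            }
        }
        return heads;
    }

    static List<Doc> sampleDocs() {
        return Arrays.asList(
                new Doc(1, "a", 1.0f, 100L),  // older, same score as doc 2
                new Doc(2, "a", 1.0f, 200L),  // newer, same score as doc 1
                new Doc(3, "b", 0.5f, 50L));
    }

    public static void main(String[] args) {
        System.out.println(collapseByScore(sampleDocs()).get("a").id); // prints 1
        System.out.println(collapseBySort(sampleDocs()).get("a").id);  // prints 2
    }
}
```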

          Trey Grainger added a comment -

          Thinking about this more, it's probably going to be hard to implement an efficient "sort" ValueSource, as it would probably have to loop through all docs in the index during construction and sort them, caching the sort order for all docs so that it is available later when the value for each document is asked for separately.

          It would probably functionally work, but it seems like there's got to be a better way in the Collapse QParser itself...

          Joel Bernstein added a comment - - edited

          With ValueSource collapse criteria you will be able to break the tie.

          I'll also need to provide a ValueSource that returns the score of the current document being collapsed. Let's call that function:

          collapseScore()

          When you call this function it simply returns the score of the document being collapsed at that time. You could then have a compound function like this:

          sum(collapseScore(), field(tie_break_field))
          

          And the tie is broken.

          So the syntax would look something like this:

          fq={!collapse field=<field_name> max=sum(collapseScore(), field(x))}
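
As a rough illustration of the idea, the following stand-alone sketch selects each group head by the max of an arbitrary per-document function. It does not use Solr: collapseScore() is only the name proposed in this comment, and here a plain Java lambda over a hypothetical Doc class stands in for sum(collapseScore(), field(x)).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical model of function-based group head selection; not Solr code.
class FunctionCollapseSketch {

    static final class Doc {
        final int id;
        final String group;
        final float score;
        final float x; // stands in for the tie-break field

        Doc(int id, String group, float score, float x) {
            this.id = id;
            this.group = group;
            this.score = score;
            this.x = x;
        }
    }

    // Keeps, per group, the document with the largest function value.
    // A later document replaces the head only if its value is strictly greater.
    static Map<String, Doc> collapseByMax(List<Doc> docs, ToDoubleFunction<Doc> fn) {
        Map<String, Doc> heads = new LinkedHashMap<>();
        for (Doc d : docs) {
            Doc head = heads.get(d.group);
            if (head == null || fn.applyAsDouble(d) > fn.applyAsDouble(head)) {
                heads.put(d.group, d);
            }
        }
        return heads;
    }

    static List<Doc> sampleDocs() {
        return Arrays.asList(
                new Doc(1, "g", 2.0f, 0.1f),
                new Doc(2, "g", 2.0f, 0.3f),  // same score as doc 1, larger x
                new Doc(3, "h", 1.0f, 0.0f));
    }

    public static void main(String[] args) {
        ToDoubleFunction<Doc> scoreOnly = d -> d.score;
        ToDoubleFunction<Doc> scorePlusX = d -> d.score + d.x; // analogue of sum(score, field(x))

        System.out.println(collapseByMax(sampleDocs(), scoreOnly).get("g").id);  // prints 1
        System.out.println(collapseByMax(sampleDocs(), scorePlusX).get("g").id); // prints 2
    }
}
```

Note that a plain sum only acts as a pure tie-breaker when the added field value is small relative to the gaps between scores; a large field value can also outweigh a real score difference.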
          
          shruti suri added a comment -

          Hi Joel,

          I am seeing an ordering difference in the collapse post filter with the following queries.
          Query1
          fq={!collapse field=company_id}
          Query2
          fq={!collapse field=comany_id min=price}

          The only difference between the queries is the min parameter, which should only change the id (offering id) and give company_id in the same order. Instead, the order of company_id changes.
          Please check why the order of company_id changes.

          Regards
          Shruti

          Joel Bernstein added a comment - - edited

          Shruti,

          What are your sorting/ranking criteria? Can you post your full query?

          Ordering can change when you add the min=price. The reason is that ordering is done on the collapsed document set. So if the collapsed document set changes, your ordering will likely change. For example, the documents with a higher or lower price may have a higher or lower score.

          That being said I wouldn't rule out a bug. But I'll need more examples of how the ordering changed to be sure.
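
A small stand-alone simulation of the effect described above, assuming ordering by score on the collapsed set. It is only a sketch with hypothetical names (Doc, collapse, companyOrder) and made-up data, not the plugin's code: switching the group head from the highest-scoring document to the lowest-priced one changes which scores are left to sort on, so the group order can flip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: collapse first, then order the surviving group heads.
class CollapseOrderSketch {

    static final class Doc {
        final int id;
        final String companyId;
        final float score;
        final float price;

        Doc(int id, String companyId, float score, float price) {
            this.id = id;
            this.companyId = companyId;
            this.score = score;
            this.price = price;
        }
    }

    // Picks one head per company: lowest price if byMinPrice, otherwise highest score.
    static List<Doc> collapse(List<Doc> docs, boolean byMinPrice) {
        Map<String, Doc> heads = new LinkedHashMap<>();
        for (Doc d : docs) {
            Doc head = heads.get(d.companyId);
            boolean better = head == null
                    || (byMinPrice ? d.price < head.price : d.score > head.score);
            if (better) {
                heads.put(d.companyId, d);
            }
        }
        return new ArrayList<>(heads.values());
    }

    // Orders the collapsed heads by score descending and returns their company ids.
    static List<String> companyOrder(List<Doc> heads) {
        List<Doc> sorted = new ArrayList<>(heads);
        sorted.sort((a, b) -> Float.compare(b.score, a.score));
        List<String> ids = new ArrayList<>();
        for (Doc d : sorted) {
            ids.add(d.companyId);
        }
        return ids;
    }

    static List<Doc> sampleDocs() {
        return Arrays.asList(
                new Doc(1, "A", 0.9f, 50f),  // best score in company A
                new Doc(2, "A", 0.2f, 10f),  // cheapest in company A
                new Doc(3, "B", 0.5f, 20f));
    }

    public static void main(String[] args) {
        System.out.println(companyOrder(collapse(sampleDocs(), false))); // prints [A, B]
        System.out.println(companyOrder(collapse(sampleDocs(), true)));  // prints [B, A]
    }
}
```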

          Deepak Mishra added a comment -

          Hi Joel
          In context with Shruti's comment: we faced the ordering issue without passing any sorting parameter and with the same filters in both queries.

          Query1
          fq={!collapse field=company_id}

          Query2
          fq={!collapse field=comany_id min=price}

          Query3
          For debugging Query2, we added the score field in fl=score,offering_id,company_id...
          That actually solved the document order issue.

          Query4
          But when we passed a selective exclude in the facet field of Query3, it gave documents in the correct order but with a NullPointerException in the error and no facets (not the one in SOLR-5416).
          facet.field={!ex="samsung"}brand
          fq={!tag="samsung"}(brand:"samsung")
          The error is
          NullPointerException at org.apache.solr.search.CollapsingQParserPlugin$FloatValueCollapse.collapse(CollapsingQParserPlugin.java:852)

          Query5
          Removing score from fl in Query 4 removes the error.

          Joel Bernstein added a comment -

          Deepak,

          Can you create a new jira for this? In the description of the ticket please post your entire query and stack trace. I'll see if I can create a test to recreate it.

          Thanks,
          Joel

          Joel Bernstein added a comment - - edited

          Deepak,

          I tested with the CollapsingQParserPlugin in SOLR-5416 and I wasn't able to reproduce the bugs.

          The sort ordering seems to be working and I'm not getting the exception. The test I'm using incorporates:
          collapsing with max=float_field,
          implied ordering by score,
          fl with score,
          faceting with tag and exclude

              params = new ModifiableSolrParams();
              params.add("q", "*:*");
              params.add("fq", "{!collapse field=group_s max=test_tf}");
              params.add("defType", "edismax");
              params.add("bf", "field(id)");
              params.add("fl", "score, id");
              params.add("facet", "true");
              params.add("fq", "{!tag=test}term_s:YYYY");
              params.add("facet.field", "{!ex=test}term_s");

              assertQ(req(params), "*[count(//doc)=2]",
                  "//result/doc[1]/float[@name='id'][.='5.0']",
                  "//result/doc[2]/float[@name='id'][.='1.0']");
          
          
          Deepak Mishra added a comment -

          Joel, I created a new JIRA and attached the queries in SOLR-5554

          Deepak Mishra added a comment - - edited

          Joel, check the JIRA SOLR-5554 again. I have attached the details to reproduce the error and the error log in FINE mode.

          Phil John added a comment -

          The one thing this doesn't seem to do, which the current field collapsing solution does, is report how many items there are in each group. That is useful if you want to display the top result but also show a link saying "X others available". Our use case is collapsing multiple manifestations of a bibliographic work (i.e. multiple editions of the same work), so with the grouping feature we get back the size of each group and can say "5 other editions also available", then link to a search on the key we collapsed by.

          Is this planned, or will that come in the more generic aggregation support planned for 5.0?

          Joel Bernstein added a comment -

          Phil,

          This functionality is planned. The description in this ticket mentions an ExpandComponent, which will expand the groups for a single page of search results. The planned roadmap for the collapse/expand work is to first push out SOLR-5408 and SOLR-5416, which are bug fixes in Solr 4.6.1, followed by SOLR-5536 (possibly in Solr 4.7), which is an enhancement to the CollapsingQParserPlugin. The ExpandComponent is the next planned feature after that.

          Joel

          Phil John added a comment -

          Hi Joel,

          Thanks for the clarification - I wondered if it would be in the expander, but came away a bit confused as to what that would end up doing.

          Nice to know it'll come, and as a workaround we can just query for counts from our DB grouped by collapse key until it lands in trunk.
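
A minimal sketch of that workaround, assuming the collapse keys of the matching records have already been fetched from the database. The class and method names are hypothetical, and in practice the counting would more likely be a GROUP BY in the database itself:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupCounts {

    /** Counts how many records share each collapse key, preserving first-seen order. */
    public static Map<String, Integer> countByKey(List<String> collapseKeys) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String key : collapseKeys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    /** Builds the link text shown next to the single document Solr returns for a group. */
    public static String othersLabel(int groupSize) {
        int others = groupSize - 1; // the collapsed result itself is already displayed
        if (others <= 0) {
            return "";
        }
        return others == 1
                ? "1 other edition also available"
                : others + " other editions also available";
    }

    public static void main(String[] args) {
        // Hypothetical collapse keys, one per record that matched the search.
        List<String> keys = Arrays.asList("work-1", "work-1", "work-2", "work-1");
        countByKey(keys).forEach((key, size) ->
                System.out.println(key + ": " + othersLabel(size)));
    }
}
```

The label would then link to a search on the collapse key, as with the old grouping feature.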

          Thanks,

          Phil.

          shruti suri added a comment -

          Hi Joel,
          Can I get the functionality of group.facet=true with the collapsing PostFilter?

          shruti

          Joel Bernstein added a comment -

          Hi Shruti,

          The collapsing post filter separates collapsing and faceting completely. You have access to all of Solr's faceting capability on the collapsed set. If you need a particular facet to be computed on the uncollapsed set, you can use the tag/exclude facet functionality to exclude the collapse filter for that facet.
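
A minimal sketch of the tag/exclude approach, written in plain Java so that it runs without SolrJ. The field names `group_s` and `term_s`, the tag name `collapse`, and the output key suffix are assumptions for illustration, and the sketch assumes the collapse filter accepts a `tag` local param like any other fq:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CollapseFacetParams {

    /** Builds request parameters; a parameter may repeat, so values are kept in lists. */
    public static Map<String, List<String>> build(String collapseField, String facetField) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        add(params, "q", "*:*");
        // Tag the collapse filter so that an individual facet can exclude it.
        add(params, "fq", "{!collapse tag=collapse field=" + collapseField + "}");
        add(params, "facet", "true");
        // This facet is counted on the collapsed result set.
        add(params, "facet.field", facetField);
        // This facet excludes the tagged filter, so it is counted on the uncollapsed set.
        add(params, "facet.field",
                "{!ex=collapse key=" + facetField + "_uncollapsed}" + facetField);
        return params;
    }

    private static void add(Map<String, List<String>> params, String name, String value) {
        params.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    public static void main(String[] args) {
        // Print the parameters as name=value pairs (not URL-encoded).
        build("group_s", "term_s").forEach((name, values) ->
                values.forEach(value -> System.out.println(name + "=" + value)));
    }
}
```

With SolrJ or the test harness, the same name/value pairs would be added to a `ModifiableSolrParams` instance instead of a map.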

          It looks like combining the collapsing postfilter with Solr pivot facets might give you the type of functionality you are looking for.

          Joel

          Simon Endele added a comment -

          Hi Joel,

          A similar question to Phil John's: is it correct that no equivalent of "group.limit" from the old grouping is, or will be, available?
          I.e. only one document is returned for each group and the ExpandComponent can be used to get more, right?

          I always thought that the aim of the ExpandComponent is to return additional docs in a sense that these documents were not hit by the query (we wrote a component by ourselves for that based on the old grouping functionality).
          Will that be possible with the ExpandComponent, or will it only be possible to fetch n (or all) documents of each group that were hit and collapsed by the CollapsingQParserPlugin (each only for a single page, of course)?

          See also my question above concerning a filter query for the ExpandComponent.

          Thanks in advance,
          Simon

          Joel Bernstein added a comment - - edited

          Simon,

          The "limit" on group size will be available in the expand component. You are correct that the CollapsingQParserPlugin returns only one document per group and the Expand component will bring back the rest.

          The initial functionality of the Expand component will return only group members that hit the query. We can iterate on this design to include a limiting filter query and also change the main query to allow retrieval of group members that were not in the original main query. If time allows I can try to get all of this into the initial release, but my aim is to have the component ready for Solr 4.7 if possible. Whatever doesn't make it in can be added in future releases.

          The initial implementation of the expand component is being worked on in my GitHub fork:

          https://github.com/joelbernstein2013/heliosearch/tree/expand

          I'll be creating a JIRA ticket for this soon.

          Joel

          Joel Bernstein made changes -
          Link This issue relates to SOLR-5720 [ SOLR-5720 ]
          Joel Bernstein added a comment -

          The ExpandComponent JIRA ticket is SOLR-5720.


            People

            • Assignee:
              Joel Bernstein
              Reporter:
              Joel Bernstein
            • Votes:
              3
              Watchers:
              21
