Solr / SOLR-2863

Solr 3.4 group.truncate does not work with facet queries

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Later
    • Affects Version/s: 3.4
    • Fix Version/s: None
    • Component/s: search
    • Environment:

      Solr 3.4 on Windows Server 2008.

      Description

      When using grouping with group.truncate=true, the following simple facet query:
      facet.query=Monitor_id:[380000 TO 400000]

      doesn't give the same count as the ngroups result for the equivalent filter query:
      fq=Monitor_id:[380000 TO 400000]

      From the Wiki page: 'group.truncate: If true, facet counts are based on the most relevant document of each group matching the query.'

      If I turn off group.truncate then the counts are the same, as I'd expect - but unfortunately I'm only interested in the grouped results.

      I asked this question on the solr-user mailing list, and Martijn van Groningen replied that it is likely a bug.

      I'd be very interested in any workaround for this bug!

        Activity

        Martijn van Groningen added a comment -

        I initially thought it was a bug, but I'm doubting if it is. I can't reproduce it now. Can you specify the complete requests that you are sending to Solr?

        Ian Grainger added a comment -

        I can reproduce it with a very basic filter/facet query:

        /solr/select?q=*:*&group.field=Unique_Name&group.truncate=true&group.ngroups=true&group=true&facet=true&facet.query=Monitor_id:[380000%20TO%20400000]
        gives a facet count of 103

        vs.

        /solr/select?q=*:*&group.field=Unique_Name&group.truncate=true&group.ngroups=true&group=true&facet=true&fq=Monitor_id:[380000%20TO%20400000]
        gives ngroups of 4372

        It only happens if the values are different for each document in the group (hence using an ID - which isn't the one being grouped on) - if they are all the same, truncate works as expected.
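
        For anyone wanting to repeat the comparison, the two requests can be built programmatically. This is only a sketch: the parameter names and values come from the URLs above, q is assumed to be the match-all query *:*, and the helper name build_url and the relative base path are illustrative choices, not part of Solr.

        ```python
        from urllib.parse import urlencode

        # Parameters shared by both requests, as given in the report.
        # Assumption: q is the match-all query "*:*".
        COMMON = [
            ("q", "*:*"),
            ("group", "true"),
            ("group.field", "Unique_Name"),
            ("group.truncate", "true"),
            ("group.ngroups", "true"),
            ("facet", "true"),
        ]
        RANGE = "Monitor_id:[380000 TO 400000]"


        def build_url(range_param, base="/solr/select"):
            """Build the select URL with the range applied through
            `range_param`, which is either "facet.query" or "fq".

            build_url is a hypothetical helper for this sketch.
            """
            return base + "?" + urlencode(COMMON + [(range_param, RANGE)])


        facet_url = build_url("facet.query")   # compare the facet_queries count ...
        filter_url = build_url("fq")           # ... against ngroups in this response

        print(facet_url)
        print(filter_url)
        ```

        Prefix the printed paths with your own host and port before sending them; the two responses should show the mismatch described above whenever Monitor_id varies within a group.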

        Martijn van Groningen added a comment -

        "It only happens if the values are different for each document in the group (hence using an ID - which isn't the one being grouped on)"

        What values do you mean? The values of the group field (Unique_Name in your case) or Monitor_id field?

        Ian Grainger added a comment -

        It doesn't work if the values of the query field (Monitor_id) are different on each document in the group (as you'd normally expect in an arbitrary group of documents).

        I have tried this with fields that vary with Unique_Name (i.e. if Unique_Name is the same, their values will match), and truncate works correctly for those fields.

        Martijn van Groningen added a comment -

        OK, I get what you mean. I think this is not a bug, but rather a missing feature. The type of post-grouping faceting you want isn't yet implemented in Solr / Lucene. In LUCENE-3097 I described three different post-grouping facet types. The one that fits your need (which I call matrix counts) is not yet implemented.

        For each group matching the query, group.truncate selects only the most relevant document and uses that as the base for faceting. Any subsequent documents inside a group are not visible during faceting.

        Unfortunately, the combination of group.ngroups and fq does work the way you want: the filter is applied before grouping, so ngroups counts every group that contains at least one matching document. This explains the difference between the facet count and the ngroups count.
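
        The difference between the two counts can be illustrated with a toy model. This is a sketch of the behaviour described above, not Solr's implementation: the documents, group names, and function names are all invented for illustration, and the list order stands in for relevance order.

        ```python
        # Toy index: (group value, Monitor_id), listed in relevance order.
        # All values are invented for illustration.
        docs = [
            ("a", 100), ("a", 390000),   # group "a": top document is outside the range
            ("b", 385000), ("b", 200),   # group "b": top document is inside the range
            ("c", 50), ("c", 60),        # group "c": no document is inside the range
        ]


        def in_range(monitor_id):
            """Stand-in for Monitor_id:[380000 TO 400000] (inclusive bounds)."""
            return 380000 <= monitor_id <= 400000


        def truncated_facet_count(docs):
            """facet.query count under group.truncate: only the first (most
            relevant) document of each group is visible to faceting."""
            heads = {}
            for group, monitor_id in docs:
                heads.setdefault(group, monitor_id)  # keep the first doc per group
            return sum(1 for monitor_id in heads.values() if in_range(monitor_id))


        def ngroups_with_filter(docs):
            """ngroups with fq: the filter runs before grouping, so any group
            with at least one matching document is counted."""
            return len({group for group, monitor_id in docs if in_range(monitor_id)})


        print(truncated_facet_count(docs))  # 1: only group "b" has a matching top document
        print(ngroups_with_filter(docs))    # 2: groups "a" and "b" each contain a match
        ```

        Group "a" is the interesting case: it contains a matching document, but not as its most relevant one, so it is counted by the filter query and missed by the truncated facet.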

        Ian Grainger added a comment -

        So faceting with group.truncate will only look at the content of the first document in the group after the query/sort has been performed?

        To enable what I would expect Solr would need to re-sort the documents in the group for each facet.query? There's no way I can specify the sort inside the facet query to get this to happen?

        Does that mean this doesn't work for field facets either (until 'matrix counts' are done)?

        Martijn van Groningen added a comment -

        "So faceting with group.truncate will only look at the content of the first document in the group after the query/sort has been performed?"

        Yes. It will only have the most relevant document. What the most relevant document is depends on your sort.

        "To enable what I would expect Solr would need to re-sort the documents in the group for each facet.query? There's no way I can specify the sort inside the facet query to get this to happen?"

        There is no workaround for this as far as I know. You can't sort inside a facet.query, because it only counts.

        "Does that mean this doesn't work for field facets either (until 'matrix counts' are done)?"

        Until matrix facets have been implemented, any facet type will have the same problem.

        Ian Grainger added a comment (edited) -

        So LUCENE-3097 is fixed and in (unreleased) trunk of Lucene? How can I get a build of Solr with this fix? (Sorry if that's a dumb question, newbie here.)

        Martijn van Groningen added a comment -

        No, it isn't fixed. The only thing to do is implement matrix grouped facet counts.

        Ian Grainger added a comment -

        OK so the patch for Lucene doesn't help Solr?

        Martijn van Groningen added a comment -

        No, unfortunately it doesn't help Solr, and it also doesn't help Lucene. It is on the roadmap for post-grouping faceting.

        Martijn van Groningen added a comment -

        In order to fix this, LUCENE-3097 needs to be resolved first.


          People

          • Assignee: Unassigned
          • Reporter: Ian Grainger