Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Invalid
    • Affects Version/s: 1.4
    • Fix Version/s: 1.4
    • Component/s: clients - java, search
    • Labels:
      None

      Description

      SolrJ reports facet results differently from the REST interface. Over REST, if you apply a facet via the fq param, it is always reported back in the list of facets in the response, whatever its count. SolrJ, however, only reports back facets whose counts don't match the total number of documents. This is quite frustrating to deal with.

      The difference can be seen when ORing or ANDing values in the fq param. When I OR two facet values together, they come back in SolrJ because their counts don't match the total docs. But if I AND them together, they don't appear in the list, so I then need to munge in the applied fq values.

      Why the difference in behavior between REST and SolrJ?

        Activity

        Andrew Nagy created issue -
        Ryan McKinley added a comment -

        There is no difference between the results in solrj and directly querying solr (solrj makes the same calls to solr).

        Your issue is probably related to facet.mincount=0 vs facet.mincount=1.
        Check:
        http://wiki.apache.org/solr/SimpleFacetParameters

        If that does not fix things, ask on solr-user@lucene...
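
For readers comparing the two settings, here is a minimal sketch of how the facet parameters end up in a request query string. It assumes a hypothetical facet field named `format`; the class and method names are illustrative only, and only the parameter names `q`, `facet`, `facet.field`, and `facet.mincount` are taken from Solr itself.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of SolrJ: builds a URL query string.
class FacetQuerySketch {
    static String queryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) sb.append('&');
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    // Example parameters; "format" is an assumed field name.
    static Map<String, String> exampleParams(int minCount) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");
        p.put("facet", "true");
        p.put("facet.field", "format");
        p.put("facet.mincount", String.valueOf(minCount));
        return p;
    }

    public static void main(String[] args) {
        System.out.println(queryString(exampleParams(0)));
        System.out.println(queryString(exampleParams(1)));
    }
}
```

With facet.mincount=1, values whose count is zero are left out of the facet list; with facet.mincount=0 they are included.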

        Ryan McKinley made changes -
        Field Original Value New Value
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Invalid [ 6 ]
        Andrew Nagy added a comment -

        I am referring to the code in SolrJ that filters out any facets with the same count as the total number of search results. This is different from the REST interface.

        See the bottom of:
        http://svn.apache.org/viewvc/lucene/solr/trunk/src/solrj/org/apache/solr/client/solrj/response/FacetField.java?revision=724175&view=markup

        SolrJ filters out facet values, whereas the raw output from the REST interface does not.

        Andrew Nagy made changes -
        Resolution Invalid [ 6 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Andrew Nagy added a comment -

        It might also be beneficial to have a list of "Applied Facets" so we can differentiate whether a facet has been applied and what the count is for that facet.

        Jayson Minard added a comment -

        I see what Andrew is talking about.

        The difference is this:

        Solrj returns a list of further refinements, so it strips out anything that would not further reduce the set. It says, if the facet count == the total document count, don't return it in the list of further refinements.

        Whereas the raw response obviously has not yet stripped it.

        So Solrj could probably use a list of facets that are not further refinements if it is interesting for users of the API to have those. They don't make sense as possible new refinements but they do make sense in that they exist and some people care that they exist in the resulting data.
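
The filtering described in this comment can be sketched in plain Java. This is a minimal model, not the SolrJ source: `FacetCount`, `LimitingFacetsSketch`, and `limiting` are hypothetical names, and reading "would not further reduce the set" as "count is not below the total" is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a facet value and its count; not the SolrJ class.
class FacetCount {
    final String name;
    final long count;

    FacetCount(String name, long count) {
        this.name = name;
        this.count = count;
    }
}

class LimitingFacetsSketch {
    // Keep only values whose count is below numFound, i.e. values that
    // would narrow the result set further if applied as a filter.
    static List<FacetCount> limiting(List<FacetCount> all, long numFound) {
        List<FacetCount> out = new ArrayList<>();
        for (FacetCount c : all) {
            if (c.count < numFound) {
                out.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up counts: two values match every document, one does not.
        List<FacetCount> all = Arrays.asList(
                new FacetCount("book", 10),
                new FacetCount("english", 10),
                new FacetCount("2008", 4));
        for (FacetCount c : limiting(all, 10)) {
            System.out.println(c.name + " (" + c.count + ")");
        }
    }
}
```

A value that was ANDed in through fq matches every remaining document, so its count equals the total and a filter like this drops it. That is the behavior the reporter observed.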

        Ryan McKinley added a comment -

        Are you talking about:

           public FacetField getLimitingFields(long max) 
        

        If so, that is just a utility function that lets you filter out facets that have fewer options than some number (typically getNumFound).

        If you don't want filtered results, just use that FacetField directly and call:

           public List<Count> getValues();
        
        Ryan McKinley added a comment -
        It might also be beneficial to have a list of "Applied Facets"

        I don't follow. As is, every facet that is applied creates a section in the response (not to mention that you put it in the request). What am I missing?

        Are you talking about fq=?

        Ryan McKinley added a comment -

        actually, perhaps you are using:

        QueryResponse#getFacetFields()
        vs
        QueryResponse#getLimitingFacets();

        The first returns all results directly from solr; the second returns only facets that will reduce the number of documents returned.

        Jayson Minard added a comment -

        Erk. Guess I got caught up in the fun and games... I always think of getFacetFields() as the list of facet fields passed in as the requested set of facets, not the response. Don't know where my brain was at.

        Guessing that solves Andrew's problem and this should be closed.

        Andrew?

        Andrew Nagy added a comment -

        Yes - thanks - this does solve my problem.

        Regarding the "Applied Facets", it might be nice to separate the list of returned facets into a list of available facets and a list of "applied" facets.

        What is driving this idea is my curiosity about how to build a list of facets that uses checkboxes instead of links. How do I know whether the checkboxes should be checked without saving some state information about what the user clicked on? It would be nice if solr could do this for me, either by flagging the facets that are in my fq or by keeping them in a different list.
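
One client-side way to approach the checkbox question is to derive the checked state from the fq values the application already sends with each request. The sketch below is only an illustration under assumptions: `AppliedFacetsSketch`, `toFilter`, and `isChecked` are hypothetical names, the field `format` is made up, and it only works when every fq is a simple field:"value" term built by the same helper (not an OR/AND expression).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of SolrJ.
class AppliedFacetsSketch {
    // Build the fq string for one facet value, quoting the value and
    // escaping backslashes and double quotes inside it.
    static String toFilter(String field, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    // A checkbox is checked when its filter is among the fq values
    // that were sent with the current request.
    static boolean isChecked(Set<String> appliedFq, String field, String value) {
        return appliedFq.contains(toFilter(field, value));
    }

    public static void main(String[] args) {
        // fq values of the current request, as the application built them.
        Set<String> appliedFq = new HashSet<>(Arrays.asList(toFilter("format", "Book")));
        System.out.println(isChecked(appliedFq, "format", "Book"));
        System.out.println(isChecked(appliedFq, "format", "Journal"));
    }
}
```

Because the fq values travel with every request, the checked state can be rebuilt from the request parameters alone, without separate session state.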

        Andrew Nagy made changes -
        Resolution Invalid [ 6 ]
        Status Reopened [ 4 ] Resolved [ 5 ]
        Ryan McKinley added a comment -

        You may also want to check out SOLR-911 or SOLR-792 – solrj does not deal with this functionality yet. Contributions are welcome!

        Grant Ingersoll added a comment -

        Bulk close for Solr 1.4

        Grant Ingersoll made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Unassigned
            Reporter:
            Andrew Nagy
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:
