Solr / SOLR-1351

facet on same field different ways

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.3
    • Component/s: None
    • Labels:
      None

      Description

      There is a general need to facet on the same field in different ways (different prefixes, different filters). We need a way to express this.

      1. SOLR-1351-facet-same-field.patch
        10 kB
        Ryan McKinley
      2. SOLR-1351-B.patch
        13 kB
        Robert Purdy
      3. SOLR-1351.patch
        12 kB
        Uri Boness

        Issue Links

          Activity

          Uri Boness added a comment -

          This is something that I've done in the far past (Solr 1.2), and the way I see it, facets should be identified by a unique id rather than by the field name, and the facet results would then be grouped by these ids. I think this can be done by just adding one extra parameter in the form:

          f.<fieldName>.facet.id
          

          In practice, this parameter means that all other field-specific facet parameters need to use this id instead of the field name. For example:

          Assume we have a field called "cat" that represents a category. Right now (without an id) we can do:

          q=*:*&facet=true&facet.field=cat&f.cat.facet.sort=true&f.cat.facet.limit=20&f.cat.facet.mincount=1
          

          With the id introduced:

          q=*:*&facet=true&facet.field=cat&f.cat.facet.id=category&f.category.facet.sort=true&f.category.facet.limit=20&f.category.facet.mincount=1
          

          Now to support multiple "configurations":

          q=*:*&facet=true&facet.field=cat&f.cat.facet.id=cat1&f.cat1.facet.sort=true&f.cat1.facet.limit=20&f.cat1.facet.mincount=1&f.cat.facet.id=cat2&f.cat2.facet.sort=false&f.cat2.facet.count=0
          

          Note that even after introducing the id param, backward compatibility can easily be maintained - we just determine that when the id param is not specified, the field name is the default id.

          From experience, I can tell you that adding this feature not only will enable multiple facets on the same field, but IMO will also make it much easier to develop search clients and tools on top of Solr.

          If this solution sounds reasonable, I can start working on a patch for it.

          Uri Boness added a comment -

          Another option is to define the id as a local param:

          q=*:*&facet=true&facet.field={!id=category}cat&f.category.facet.sort=true&f.category.facet.limit=20&f.category.facet.mincount=1
          

          and for multiple configurations:

          q=*:*&facet=true&facet.field={!id=cat1}cat&f.cat1.facet.sort=true&f.cat1.facet.limit=20&f.cat1.facet.mincount=1&facet.field={!id=cat2}cat&f.cat2.facet.sort=false&f.cat2.facet.count=0
          

          I guess it plays nicer with the new functionality in 1.4

          Uri Boness added a comment -

          Took the approach as described above. The only difference is that instead of the "id" parameter I reused the "key" parameter already supported by this component. The idea is that now, when the "key" local param is specified, all the specific facet params need to use the key instead of the field name.

          q=*:*&facet=true&facet.field={!key=cat1}cat&f.cat1.facet.sort=true&f.cat1.facet.limit=20&
          f.cat1.facet.mincount=1&facet.field={!key=cat2}cat&f.cat2.facet.sort=false&f.cat2.facet.count=0
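
To make the key/field distinction above concrete, here is a minimal sketch of how a facet.field value such as {!key=cat1}cat could be split into the key (used for per-facet parameters and for labelling the result) and the underlying field. FacetFieldSpec is a hypothetical helper written for illustration, not part of Solr or of the attached patches, and it assumes simple unquoted name=value local params. When no key is given, the field name serves as the key, which is what keeps existing requests working.

```java
/** Hypothetical helper, not part of Solr: splits "{!key=cat1}cat" into key and field. */
final class FacetFieldSpec {
    final String key;    // name used for f.<key>.* overrides and for the result label
    final String field;  // indexed field that is actually faceted on

    FacetFieldSpec(String key, String field) {
        this.key = key;
        this.field = field;
    }

    /** Parses a facet.field value; without a key local param the field name is the key. */
    static FacetFieldSpec parse(String value) {
        if (!value.startsWith("{!")) {
            return new FacetFieldSpec(value, value);
        }
        int close = value.indexOf('}');
        if (close < 0) {
            throw new IllegalArgumentException("unterminated local params: " + value);
        }
        String field = value.substring(close + 1);
        String key = field; // default when no key=... is present
        // Assumption: local params are whitespace-separated name=value pairs without quoting.
        for (String pair : value.substring(2, close).trim().split("\\s+")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("key")) {
                key = pair.substring(eq + 1);
            }
        }
        return new FacetFieldSpec(key, field);
    }

    public static void main(String[] args) {
        FacetFieldSpec spec = parse("{!key=cat1}cat");
        System.out.println(spec.key + " -> " + spec.field);
    }
}
```

With this split, two facet.field entries on the same field "cat" produce two distinct keys (cat1 and cat2), each of which can carry its own f.<key>.facet.* parameters.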
          

          This applies not only to simple field facets but also to date facets:

          q=*:*&facet=true&facet.date={!key=foo}bday&f.foo.facet.date.start=1976-07-01T00:00:00.000Z&
          f.foo.facet.date.end=1976-07-01T00:00:00.000Z+1MONTH&f.foo.facet.date.gap=+1DAY&
          f.foo.facet.date.other=all&facet.date={!key=bar}bday&
          f.bar.facet.date.end=1976-07-01T00:00:00.000Z+7DAY&f.bar.facet.date.gap=+1DAY&
          
          Hoss Man added a comment -

          Bulk updating 240 Solr issues to set the Fix Version to "next" per the process outlined in this email...

          http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3Calpine.DEB.1.10.1005251052040.24672@radix.cryptio.net%3E

          Selection criteria was "Unresolved" with a Fix Version of 1.5, 1.6, 3.1, or 4.0. email notifications were suppressed.

          A unique token for finding these 240 issues in the future: hossversioncleanup20100527

          Robert Purdy added a comment -

          Patch does not work with current nightly in solr 3.2, solr 4.0 and with latest stable release 1.4.1. Has this been integrated into the current version(s)? If not is there an updated patch somewhere?

          Thanks Robert.

          Hoss Man added a comment -

          Realized we had some duplicate issues.

          Note that the description/comments in SOLR-2251 have some good examples of use cases we should definitely make sure we support / test for.

          Robert Muir added a comment -

          Bulk move 3.2 -> 3.3

          Robert Muir added a comment -

          3.4 -> 3.5

          Hoss Man added a comment -

          Someone asked on the dev list about resurrecting Uri's patch...

          reviewing it now, i suspect that getting it to apply to the trunk (to work as Uri originally wrote it) would just be a simple matter of tweaking the paths and line numbers to get it to apply cleanly.

          There are a few other things that I think would need to be addressed before i'd be comfortable committing however...

          1) need to make sure range faceting is also supported

          2) need a test to verify that things work as expected with distributed searches – some of the params (facet.limit, facet.mincount, facet.offset) have special handling in FacetComponent that may need to be tweaked to work properly when they are specified on a "key" instead of on a true field name.

          3) need to make sure some precedence rules like those described in SOLR-2251 work, are tested, and documented. The main issue here is that if someone is already using the "key" local param for the purposes of filter exclusion, the precedence of per-field overrides on things like "facet.limit" should still apply and not just be ignored because of "per-key" overrides.

          ie: a request like this in Solr 3.4...

          facet.limit=10&facet.field=foo&facet.field={!key=bar ex=dt}foo&f.foo.facet.limit=100

          ...causes both of the facet results for field "foo" to get a limit of 100 overriding the "global" limit of 10 – that shouldn't change when this feature is added, and skimming Uri's original patch i'm pretty sure it would: getFieldParams("bar","facet.limit") isn't going to pay any attention to "f.foo.facet.limit" at all, would look for "f.bar.facet.limit" and if it's not found then it would just use the value of "facet.limit"

          #1 should be trivial, #2 is a big question mark: I have no idea if it will just work as is or if some special logic needs to be added (and if so what). #3 is probably going to be a little tricky just because it doesn't play nicely with any of the logic in SolrParams about how per-field overrides should work, so we may need a new method in SolrParams to deal with this, or maybe just make SolrParams.fname() public so SimpleFacets can (cleanly) check the things it wants to check directly with something like...

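          // Look for a per-key override first, then fall back to the per-field / global value.
          // Assumes SolrParams.fname() has been made accessible, as suggested above.
          static String getKeyOrDefault(final String key, final String field, 
                                        final String param, final SolrParams params) {
            String result = params.get(params.fname(key, param));
            if (null == result) {
              result = params.getFieldParam(field, param);
            }
            return result;
          }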
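
The precedence described in point 3 (per-key override, then per-field override, then the global value) can also be sketched without any Solr classes. The following is an illustrative stand-in that uses a plain Map in place of SolrParams; FacetParamResolver and resolve are hypothetical names, not Solr API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the lookup discussed above; a plain Map replaces SolrParams. */
final class FacetParamResolver {

    /** Resolves a facet parameter: f.<key>.<param>, then f.<field>.<param>, then <param>. */
    static String resolve(Map<String, String> params, String key, String field, String param) {
        String value = params.get("f." + key + "." + param);       // per-key override
        if (value == null) {
            value = params.get("f." + field + "." + param);        // per-field override
        }
        if (value == null) {
            value = params.get(param);                             // global value
        }
        return value;
    }

    public static void main(String[] args) {
        // Mirrors the Solr 3.4 request quoted above:
        // facet.limit=10&facet.field=foo&facet.field={!key=bar ex=dt}foo&f.foo.facet.limit=100
        Map<String, String> params = new HashMap<>();
        params.put("facet.limit", "10");
        params.put("f.foo.facet.limit", "100");
        // Key "bar" has no override of its own, so the per-field value for "foo" still applies.
        System.out.println(resolve(params, "bar", "foo", "facet.limit"));
    }
}
```

Under this ordering, the facet keyed "bar" keeps the per-field limit of 100 unless f.bar.facet.limit is set explicitly, so existing requests that use "key" only for filter exclusion behave as before.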
          
          Hoss Man added a comment -

          Bulk of fixVersion=3.6 -> fixVersion=4.0 for issues that have no assignee and have not been updated recently.

          email notification suppressed to prevent mass-spam
          pseudo-unique token identifying these issues: hoss20120321nofix36

          Robert Purdy added a comment -

          Hey all, has this been integrated yet into any current version of Solr, or is there a way in a version newer than Solr 1.5 to perform multiple facet prefix queries on the same field? Any help on this would be great, as I am currently blocked from upgrading our system until this feature is available, unless I perform many queries per page like we used to before applying Uri's patch.

          Thanks Robert.

          Erick Erickson added a comment -

          Robert:

          The key is looking at the "Resolution :" field. That'll change to "Fixed" when it's been committed.....

          So no, this isn't in any version yet.

          Robert Purdy added a comment -

          Thanks Erick, I was unsure.

          Robert Purdy added a comment -

          I have modified Uri's patch to work with trunk. It seems to work fine and allows me to have multiple facet prefix queries on the same field.

          Ryan McKinley added a comment -

          Adding tests from these patches.

          Rather than:

          
          +                    ,"facet.date", "{!key=foo}" + f
          +                    ,"f.foo.facet.date.start", "1976-07-01T00:00:00.000Z"
          +                    ,"f.foo.facet.date.end",   "1976-07-01T00:00:00.000Z+1MONTH"
          +                    ,"f.foo.facet.date.gap",   "+1DAY"
          +                    ,"f.foo.facet.date.other", "all"
          +                    ,"facet.date", "{!key=bar}" + f
          +                    ,"f.bar.facet.date.start", "1976-07-01T00:00:00.000Z"
          +                    ,"f.bar.facet.date.end",   "1976-07-01T00:00:00.000Z+7DAY"
          +                    ,"f.bar.facet.date.gap",   "+1DAY"
          

          We now have:

          
                          ,"facet.date", "{!key=foo " +
                            "facet.date.start=1976-07-01T00:00:00.000Z " +
                            "facet.date.end=1976-07-01T00:00:00.000Z+1MONTH " +
                            "facet.date.gap=+1DAY " +
                            "facet.date.other=all " +
                          "}" + f
                          ,"facet.date", "{!key=bar " +
                            "facet.date.start=1976-07-01T00:00:00.000Z " +
                            "facet.date.end=1976-07-01T00:00:00.000Z+7DAY " +
                            "facet.date.gap=+1DAY " +
                          "}" + f
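
As an illustration of the second style, the local-params string can be assembled programmatically instead of by string concatenation. This is a minimal sketch: LocalParamsBuilder is a hypothetical helper, not part of Solr or its test framework, and it assumes keys and values contain no spaces or quotes, so no escaping is applied.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds "{!key=<key> name=value ...}<field>" for a facet parameter. */
final class LocalParamsBuilder {

    /** Assumes keys and values need no quoting (no spaces, quotes, or closing braces). */
    static String build(String key, Map<String, String> overrides, String field) {
        StringBuilder sb = new StringBuilder("{!key=").append(key);
        for (Map.Entry<String, String> e : overrides.entrySet()) {
            sb.append(' ').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.append('}').append(field).toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> bar = new LinkedHashMap<>();
        bar.put("facet.date.start", "1976-07-01T00:00:00.000Z");
        bar.put("facet.date.end", "1976-07-01T00:00:00.000Z+7DAY");
        bar.put("facet.date.gap", "+1DAY");
        System.out.println(build("bar", bar, "bday"));
    }
}
```

The field name "bday" is taken from the earlier date-facet example in this thread; the test code above uses a variable f instead.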
          
          Ryan McKinley added a comment -

          While the style is different than proposed in this patch, you can get the same results using the localParams syntax from SOLR-4717.

          Uwe Schindler added a comment -

          Closed after release.


            People

            • Assignee:
              Ryan McKinley
              Reporter:
              Yonik Seeley
            • Votes:
              13
              Watchers:
              16
