SOLR-9519

Check sub-facets of empty facet buckets for operations that may expand the domain (like filter exclusions)

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.4, 7.0
    • Component/s: Facet Module
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels:
      None

      Description

      http://markmail.org/message/bgplt2qdxc7gqga5

      Background: the JSON Facet API does not execute sub-facets for a facet
      bucket with a 0 count (and the root facet bucket is like any other
      facet bucket).
      This was to help prevent the combinatorial explosion of deeply nested
      sub-facets with useless information.

      This is incorrect, though, when a sub-facet does something
      that can expand the domain rather than just restrict it. Facet
      exclusions are one such case.
      For zero-count facet buckets, we should check whether any sub-facets
      have these properties and recurse if so.

      Aside: not processing empty sets also helped with issues like what junk values to fill in for statistics... min, max, average, std, etc. JSON doesn't even officially support NaN, so it's nice to be able to leave these junk values out in many circumstances.
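
To make the failure mode concrete, here is a toy model in Python. It is not Solr code: the documents, the `domain` and `terms_facet` helpers, and the filter representation are all made up for illustration, assuming only that a facet domain is the set of documents passing every filter whose tag is not excluded.

```python
# Toy model of a facet domain; not Solr code. All names here are hypothetical.
DOCS = {
    1: {"genre": "drama"},
    2: {"genre": "comedy"},
    3: {"genre": "drama"},
}

# Tagged filter queries, mirroring fq={!tag=GENRE}genre:xxx (matches nothing).
FILTERS = [
    {"tag": "GENRE", "matches": lambda doc: doc["genre"] == "xxx"},
]


def domain(exclude_tags=()):
    """Return ids of docs that pass every filter whose tag is not excluded."""
    active = [f for f in FILTERS if f["tag"] not in exclude_tags]
    return {doc_id for doc_id, doc in DOCS.items()
            if all(f["matches"](doc) for f in active)}


def terms_facet(field, doc_ids):
    """Count docs per value of `field` within the given domain."""
    counts = {}
    for doc_id in doc_ids:
        value = DOCS[doc_id][field]
        counts[value] = counts.get(value, 0) + 1
    return counts


root = domain()                            # empty: the root bucket has count 0
widened = domain(exclude_tags={"GENRE"})   # excluding the filter restores all docs

root_counts = terms_facet("genre", root)
widened_counts = terms_facet("genre", widened)
```

In this model the root bucket is empty, yet a facet that excludes the `GENRE` tag sees all three documents. Skipping every sub-facet of the empty root would drop those counts.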

      1. SOLR-9519.patch
        6 kB
        Yonik Seeley
      2. SOLR-9519.patch
        6 kB
        Yonik Seeley
      3. SOLR-9519.patch
        2 kB
        Michael Sun
      4. SOLR-9519.patch
        2 kB
        Michael Sun
      5. SOLR-9519.patch
        1 kB
        Michael Sun

        Activity

        michael.sun Michael Sun added a comment -

        Here is a patch. It checks whether a zero-count domain has any sub-facet that can alter the domain.

        michael.sun Michael Sun added a comment -

        Here is a new patch with fix and test.

        yseeley@gmail.com Yonik Seeley added a comment - - edited

        The domain check may need to be recursive. A direct child may not change the domain in a non-narrowing way, but a child of that child may.

        Further, I wonder if we should only execute those sub-facets that do have the domain change.
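
A recursive check of this kind might look like the following Python sketch. It is not the patch itself: it models a JSON Facet request as nested dicts, treats `excludeTags` as the only domain-widening operation, and the function names `can_widen_domain` and `can_produce_from_empty` are hypothetical.

```python
def can_widen_domain(facet):
    """True if this facet's own domain block can expand an empty domain.

    Assumption: excludeTags is the only widening operation considered here.
    """
    return bool(facet.get("domain", {}).get("excludeTags"))


def can_produce_from_empty(facet):
    """True if the facet, or any facet nested below it, can widen the domain."""
    if can_widen_domain(facet):
        return True
    return any(
        # Sub-facets given as strings are plain statistics and never widen.
        isinstance(sub, dict) and can_produce_from_empty(sub)
        for sub in facet.get("facet", {}).values()
    )


# The direct child (all_genres) does not widen the domain, but its child does.
request = {
    "all_genres": {
        "type": "terms",
        "field": "genre",
        "facet": {
            "top_director": {
                "type": "terms",
                "field": "directed_by",
                "domain": {"excludeTags": "GENRE"},
            },
            "directors": "unique(directed_by)",
        },
    }
}

direct_only = can_widen_domain(request["all_genres"])
recursive = can_produce_from_empty(request["all_genres"])
```

Here a check of the direct child alone returns false, while the recursive check finds the nested exclusion.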

        michael.sun Michael Sun added a comment - - edited

        The domain check may need to be recursive. A direct child may not change the domain in a non-narrowing way, but a child of that child may.

        Thanks Yonik Seeley for reviewing. Can you also give an example of this use case? I tried a query like the following and got the expected result.

        One guess for the use case is a query without a domain in the first-level facet, but that does not seem right.

        curl http://localhost:8983/solr/films/select -d 'q=*:*&fq={!tag=GENRE}genre:xxx&wt=json&indent=true&json.facet={
        top_genre: {
          type:terms,
          field:genre,
          numBuckets:true,
          limit:3,
          domain:{excludeTags:GENRE},
          facet: {
            top_director: {
              type:terms,
              field:directed_by,
              numBuckets:true,
              limit:3,
              domain:{excludeTags:GENRE}
            }
          }
        }
        }'
        
        michael.sun Michael Sun added a comment -

        Thanks Yonik Seeley for the suggestions. I have uploaded a new patch that checks for domain changes recursively.

        yseeley@gmail.com Yonik Seeley added a comment -

        Here's an update that only executes sub-facets that can produce something from an empty bucket, rather than executing all sub-facets if any of them can.
        Updated tests as well.
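
The difference between the two strategies can be sketched in Python as a per-bucket selection step. This is an illustration under assumptions, not the committed code: requests are modeled as dicts, `excludeTags` is the only widening operation considered, and `subfacets_to_run` is a hypothetical name.

```python
def can_produce_from_empty(facet):
    """True if the facet or anything nested in it declares excludeTags."""
    if not isinstance(facet, dict):
        return False  # string sub-facets are statistics; nothing to widen
    if facet.get("domain", {}).get("excludeTags"):
        return True
    return any(can_produce_from_empty(sub)
               for sub in facet.get("facet", {}).values())


def subfacets_to_run(sub_facets, bucket_count):
    """Pick the sub-facets worth executing for a bucket of the given count."""
    if bucket_count > 0:
        return dict(sub_facets)  # non-empty bucket: run everything
    # Empty bucket: keep only sub-facets that might produce something.
    return {name: sub for name, sub in sub_facets.items()
            if can_produce_from_empty(sub)}


sub_facets = {
    "top_director": {
        "type": "terms",
        "field": "directed_by",
        "domain": {"excludeTags": "GENRE"},
    },
    "by_genre": {"type": "terms", "field": "genre"},
    "directors": "unique(directed_by)",
}

on_empty = subfacets_to_run(sub_facets, 0)
on_nonempty = subfacets_to_run(sub_facets, 5)
```

For the empty bucket only `top_director` is kept, so the statistic and the plain terms facet are not computed over an empty set.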

        yseeley@gmail.com Yonik Seeley added a comment -

        Another update, this time just more tests to ensure that the constraint filters for intermediate facets are still applied when the domain becomes non-empty in the sub-facets.

        jira-bot ASF subversion and git services added a comment -

        Commit cfcf4081fcf04cf2e1d6293a05a2005f0a99942c in lucene-solr's branch refs/heads/master from Yonik Seeley
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cfcf408 ]

        SOLR-9519: recurse sub-facets of empty buckets if they can widen domain again

        jira-bot ASF subversion and git services added a comment -

        Commit ecff32633119687eb2cd9fc4bd78a865cbdd6893 in lucene-solr's branch refs/heads/branch_6x from Yonik Seeley
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ecff326 ]

        SOLR-9519: recurse sub-facets of empty buckets if they can widen domain again

        yseeley@gmail.com Yonik Seeley added a comment -

        Committed. Thanks!


          People

          • Assignee:
            Unassigned
            Reporter:
            yseeley@gmail.com Yonik Seeley