Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 6.0
    • Fix Version/s: 6.0
    • Component/s: None
    • Labels:
      None

      Description

      This ticket makes the FacetStream (SOLR-7903) expressible, so it can be used as a Streaming Expression.

      1. SOLR-7904.patch
        40 kB
        Dennis Gove
      2. SOLR-7904.patch
        40 kB
        Dennis Gove
      3. SOLR-7904.patch
        39 kB
        Dennis Gove

          Activity

          dpgove Dennis Gove added a comment -

          I'm finalizing some of the tests, but so far everything is passing. The expression format is as follows:

          facet(
            collection1,
            q="*:*",
            fl="a_s,a_i,a_f",
            sort="a_s asc",
            buckets="a_s",
            bucketSorts="sum(a_i) asc",
            bucketSizeLimit=10,
            sum(a_i), sum(a_f),
            min(a_i), min(a_f),
            max(a_i), max(a_f),
            avg(a_i), avg(a_f),
            count(*),
            zkHost="url:port"
          )
          

          It supports multiple buckets and multiple bucketSorts. All standard query properties (q, fl, sort, etc.) are also supported; the example above shows only three of them. zkHost is optional.
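
As a rough illustration of the format described above, the following Python sketch assembles a facet(...) expression string from its parts. The helper name build_facet_expression and its arguments are hypothetical and not part of Solr or the patch, and joining multiple buckets and bucketSorts with commas is an assumption about how several entries would be written.

```python
def build_facet_expression(collection, buckets, bucket_sorts, bucket_size_limit,
                           metrics, query_params=None, zk_host=None):
    """Assemble a facet(...) expression string (hypothetical helper)."""
    def quoted(name, value):
        return f'{name}="{value}"'

    parts = [collection]
    # Standard query properties such as q, fl and sort are passed through as-is.
    for name, value in (query_params or {}).items():
        parts.append(quoted(name, value))
    # Assumption: multiple buckets / bucketSorts are written comma-separated.
    parts.append(quoted("buckets", ",".join(buckets)))
    parts.append(quoted("bucketSorts", ",".join(bucket_sorts)))
    parts.append(f"bucketSizeLimit={bucket_size_limit}")
    # Metrics such as sum(a_i) or count(*) are unnamed operands.
    parts.extend(metrics)
    # zkHost is optional, so it is only emitted when provided.
    if zk_host is not None:
        parts.append(quoted("zkHost", zk_host))
    return "facet(" + ", ".join(parts) + ")"


expr = build_facet_expression(
    "collection1",
    buckets=["a_s"],
    bucket_sorts=["sum(a_i) asc"],
    bucket_size_limit=10,
    metrics=["sum(a_i)", "count(*)"],
    query_params={"q": "*:*"},
)
```

Leaving zk_host unset simply omits the zkHost parameter, which matches the note that it is optional.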

          dpgove Dennis Gove added a comment -

          I did consider an alternative format that would group the options for each bucket together and allow different settings per bucket. I steered away from it because it would require larger changes to the FacetStream implementation and may not have a use case:

          facet(
            collection1,
            q="*:*",
            fl="a_s,b_s,a_i,a_f",
            sort="a_s asc",
            bucket("a_s", sort="sum(a_i) asc", limit=5, sum(a_i), avg(a_i), count(*)),
            bucket("b_s", sort="max(a_i) desc, min(a_i) desc", limit=20, sum(a_i), min(a_i), max(a_i))
          )
          
          joel.bernstein Joel Bernstein added a comment -

          I'll take a look again at the FacetStream impl. I think the fl and sort parameters are not needed. The StreamingTests have these params, but I think they were just pasted from another test. So this should work:

          facet(
            collection1,
            q="*:*",
            buckets="a_s",
            bucketSorts="sum(a_i) asc",
            bucketSizeLimit=10,
            sum(a_i), sum(a_f),
            min(a_i), min(a_f),
            max(a_i), max(a_f),
            avg(a_i), avg(a_f),
            count(*),
            zkHost="url:port"
          )
          
          
          joel.bernstein Joel Bernstein added a comment -

          Yeah, the FacetStream flattens out the hierarchical params of the JSON facet API. This works pretty well for SQL GROUP BY queries, but it is not nearly as expressive as the JSON facet API. I think that's OK for version 1.

          joel.bernstein Joel Bernstein added a comment -

          Also, the result set is flattened to mimic the results of a SQL GROUP BY query.
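
To make the flattening concrete, here is a minimal Python sketch that turns a nested bucket structure into flat rows, one per combination of bucket values, much like GROUP BY output. The nested input shape (a "buckets" list whose entries carry a "val" key) is only loosely modeled on a JSON facet response and is an assumption, as is the function name flatten_buckets. It is not the FacetStream implementation.

```python
def flatten_buckets(node, bucket_fields, prefix=None):
    """Yield one flat dict per leaf bucket, like rows of a SQL GROUP BY."""
    prefix = prefix or {}
    field, rest = bucket_fields[0], bucket_fields[1:]
    for bucket in node[field]["buckets"]:
        row = dict(prefix)
        row[field] = bucket["val"]
        if rest:
            # Descend into the next bucket level, carrying the outer values.
            yield from flatten_buckets(bucket, rest, row)
        else:
            # Leaf level: copy the metric values next to the bucket values.
            for key, value in bucket.items():
                if key != "val":
                    row[key] = value
            yield row


# Hypothetical nested input: a_s buckets, each containing b_s buckets.
nested = {
    "a_s": {"buckets": [
        {"val": "hello0", "b_s": {"buckets": [
            {"val": "x", "sum(a_i)": 3, "count(*)": 2},
            {"val": "y", "sum(a_i)": 5, "count(*)": 1},
        ]}},
        {"val": "hello1", "b_s": {"buckets": [
            {"val": "x", "sum(a_i)": 7, "count(*)": 4},
        ]}},
    ]}
}

rows = list(flatten_buckets(nested, ["a_s", "b_s"]))
```

Each resulting row holds the value of every bucket field plus the metrics, so a consumer can treat the output as a plain list of tuples.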

          dpgove Dennis Gove added a comment (edited) -

          Alright. The expression parsing is similar to CloudSolrStream: some named parameters are required (buckets, bucketSorts, bucketSizeLimit), while the others are just passed down to the QueryRequest and are not considered explicitly. If fl and sort are not required, then it'd just be a change in the documentation and not an implementation change, since the expression parsing doesn't explicitly check that those were provided.
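
The split between required and pass-through parameters could be sketched as follows in Python. This is only an illustration of the idea described in the comment: the function split_named_params and the dict-based input are hypothetical, zkHost handling is not modeled, and the actual parsing lives in the Java patch.

```python
# The three names come from the comment above; everything else is illustrative.
REQUIRED = ("buckets", "bucketSorts", "bucketSizeLimit")


def split_named_params(named_params):
    """Separate required facet parameters from pass-through query parameters."""
    missing = [name for name in REQUIRED if name not in named_params]
    if missing:
        raise ValueError("missing required parameter(s): " + ", ".join(missing))
    required = {name: named_params[name] for name in REQUIRED}
    # Anything else (q, fl, sort, ...) is forwarded without inspection.
    passthrough = {k: v for k, v in named_params.items() if k not in REQUIRED}
    return required, passthrough


required, passthrough = split_named_params({
    "q": "*:*",
    "buckets": "a_s",
    "bucketSorts": "sum(a_i) asc",
    "bucketSizeLimit": "10",
})
```

Because fl and sort are never checked here, omitting them raises no error, which mirrors the point that dropping them would be a documentation change only.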

          dpgove Dennis Gove added a comment -

          Fully implemented. All relevant tests pass.

          dpgove Dennis Gove added a comment (edited) -

          Adds facet as a default function in the StreamHandler.

          joel.bernstein Joel Bernstein added a comment -

          +1

          Looks good to me.

          jira-bot ASF subversion and git services added a comment -

          Commit 1719838 from dpgove@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1719838 ]

          SOLR-7904: Add StreamExpression Support to FacetStream

          dpgove Dennis Gove added a comment -

          Rebased against trunk.


            People

            • Assignee:
              dpgove Dennis Gove
              Reporter:
              joel.bernstein Joel Bernstein
            • Votes:
              0
              Watchers:
              4
