Solr / SOLR-8909

Streaming Expressions should leverage streaming facets

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      The JSON Facet API can currently stream facets (use method=stream) from a single node. Each facet bucket is calculated as it is written out, so field cardinality has no effect on memory.

      This is only from a single node - normal distributed search/faceting does not stream... But that's what streaming expressions are for anyway!
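      For illustration, a single-node streamed facet request might be built as in the sketch below. It assumes a terms facet with "method": "stream" and "sort": "index asc" in a JSON request body; the field name `cat`, the facet label `buckets`, and the helper name are hypothetical, and the exact parameters should be checked against the Solr version in use.

```python
import json


def streaming_terms_facet(field, name="buckets"):
    """Build a JSON request body asking for a streamed terms facet.

    `field` and `name` are hypothetical placeholders, not values from the issue.
    """
    return {
        "query": "*:*",
        "limit": 0,  # return no documents, only facet buckets
        "facet": {
            name: {
                "type": "terms",
                "field": field,
                "method": "stream",   # calculate each bucket as it is written out
                "sort": "index asc",  # the only order streaming supports, per this issue
                "limit": -1,          # no cap on the number of buckets
            }
        },
    }


body = json.dumps(streaming_terms_facet("cat"), indent=2)
print(body)
```

      The resulting body would be posted to a single core's query endpoint; nothing here makes the request distributed.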

      One caveat: streaming currently works only with "sort=index asc" (the term order in the Lucene index).
      Future work could allow more complex sorts, at the cost of some memory to calculate the sort criteria for each bucket before streaming out. More complex sorts would also require more complex merging logic (e.g. even a sort by bucket count is not a simple merge sort and requires more buffering in the merging node).
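      To make the "simple merge" case concrete, here is a minimal sketch of how a merging node could combine per-shard bucket streams when every shard emits buckets in index (term) order. It is not Solr code: the function name and the (term, count) tuple shape are assumptions, and it relies on Python string comparison matching the index term order.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_index_ordered(*shard_streams):
    """Merge per-shard (term, count) iterators that are each sorted by term,
    summing the counts of buckets that share a term.

    Only one pending bucket per shard is buffered at a time.
    """
    merged = heapq.merge(*shard_streams, key=itemgetter(0))
    for term, group in groupby(merged, key=itemgetter(0)):
        yield term, sum(count for _, count in group)


shard_a = iter([("apple", 3), ("cherry", 1), ("plum", 4)])
shard_b = iter([("apple", 2), ("banana", 5), ("plum", 1)])
merged = list(merge_index_ordered(shard_a, shard_b))
print(merged)
# [('apple', 5), ('banana', 5), ('cherry', 1), ('plum', 5)]
```

      A sort by count breaks this scheme because a term's final count is unknown until every shard has reported it, so the merging node cannot emit a bucket as soon as it reaches the head of the merge.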

          Activity

          Joel Bernstein (joel.bernstein) added a comment (edited)

          This is another powerful tool in the toolbox.

          We can probably build this into the FacetStream by adding the method param and a new code path to handle the merge.

          With the SQL handler, we can probably use this approach in most scenarios because we can re-order the Tuples by wrapping the FacetStream in a RankStream.

          The RollupStream will likely only need to be used following distributed joins.
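          As a rough illustration of the re-ordering idea in the comment above, the sketch below keeps only the top N buckets by count while consuming an index-ordered stream, which is the kind of work a ranking wrapper would do. It is a conceptual stand-in, not the RankStream implementation; the function name and the (term, count) tuple shape are assumptions.

```python
import heapq


def top_n_by_count(bucket_stream, n):
    """Return the n buckets with the highest counts from a (term, count) stream.

    Memory use is bounded by n rather than by field cardinality.
    """
    return heapq.nlargest(n, bucket_stream, key=lambda bucket: bucket[1])


buckets = iter([("apple", 5), ("banana", 7), ("cherry", 1), ("plum", 6)])
top = top_n_by_count(buckets, 2)
print(top)
# [('banana', 7), ('plum', 6)]
```

          The input can stay in index order, so the streamed facet does not need to support a count sort for the final result to be ranked.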


            People

            • Assignee:
              Unassigned
              Reporter:
              yseeley@gmail.com Yonik Seeley
            • Votes:
              1
              Watchers:
              4