  1. Solr
  2. SOLR-13047

Add facet2D Streaming Expression

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Resolved
    • Affects Version/s: None
    • Fix Version/s: 8.2
    • Component/s: None
    • Labels: None

      Description

      The current facet expression is a generic tool for creating multi-dimensional aggregations. The facet2D Streaming Expression has semantics specific to two-dimensional facets, which are designed to be pivoted into a matrix and operated on by Math Expressions.

      facet2D will use the JSON Facet API under the covers.

      Proposed syntax:

      facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", count(*))

      The example above will return tuples containing the top 300 diseases and, for each disease, its top 10 symptoms.
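
      To picture how the example above could map onto the JSON Facet API, here is a minimal sketch in Python. It assumes a terms facet on the x field with a nested terms facet on the y field, with the two dimensions values used as bucket limits. The helper name facet2d_request and the facet keys "x" and "y" are hypothetical; the ticket does not specify the request that facet2D actually builds.

```python
import json


def facet2d_request(x, y, dimensions):
    """Build a nested terms facet for two fields (illustrative only)."""
    # "300, 10" -> (300, 10): bucket limit for x, then bucket limit for y per x bucket.
    x_limit, y_limit = (int(part.strip()) for part in dimensions.split(","))
    return {
        "x": {  # hypothetical facet name for the outer dimension
            "type": "terms",
            "field": x,
            "limit": x_limit,
            "facet": {
                # Nested facet: computed separately inside every x bucket.
                "y": {"type": "terms", "field": y, "limit": y_limit}
            },
        }
    }


request = facet2d_request("diseases", "symptoms", "300, 10")
print(json.dumps(request, indent=2))
```

      Each terms bucket carries a document count by default, which is what count(*) asks for in the proposed syntax.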

      Using Math Expressions, the tuples can be pivoted into a matrix in which the rows are the diseases, the columns are the symptoms, and the cells contain the counts. This matrix can then be clustered to find groups of diseases that are correlated by symptoms:

      let(a=facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", count(*)),
          b=pivot(a, diseases, symptoms, count(*)),
          c=kmeans(b, 10))
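
      The pivot step in the expression above can be sketched in plain Python to show the shape of the resulting matrix. This is not Solr's implementation: the sample tuples are made up for illustration, the row and column ordering (sorted labels) is an assumption, and so is filling missing disease/symptom pairs with 0.

```python
def pivot(tuples, x_field, y_field, value_field):
    """Pivot flat (x, y, value) tuples into a dense row-major matrix."""
    # Sorted label order is an assumption made for this sketch.
    rows = sorted({t[x_field] for t in tuples})
    cols = sorted({t[y_field] for t in tuples})
    row_index = {label: i for i, label in enumerate(rows)}
    col_index = {label: j for j, label in enumerate(cols)}

    # Pairs that never appear in the tuples stay at 0.0 (assumption).
    matrix = [[0.0] * len(cols) for _ in rows]
    for t in tuples:
        matrix[row_index[t[x_field]]][col_index[t[y_field]]] = float(t[value_field])
    return rows, cols, matrix


# Made-up tuples in the shape the facet2D example is described as returning.
tuples = [
    {"diseases": "flu", "symptoms": "fever", "count(*)": 120},
    {"diseases": "flu", "symptoms": "cough", "count(*)": 95},
    {"diseases": "cold", "symptoms": "cough", "count(*)": 80},
]

rows, cols, matrix = pivot(tuples, "diseases", "symptoms", "count(*)")
for label, row in zip(rows, matrix):
    print(label, row)
```

      Each row is then a vector of symptom counts for one disease, which is the form a clustering step such as kmeans operates on.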

       

      Implementation Note:

      The plan for this ticket is to create a new stream called Facet2DStream. The FacetStream code is a good starting point and can be adapted to the facet2D parameters. Tests similar to those for FacetStream can be added to StreamExpressionTest.

       

              People

              • Assignee: Joel Bernstein (joel.bernstein)
              • Reporter: Joel Bernstein (joel.bernstein)
              • Votes: 0
              • Watchers: 4

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 1h