Solr / SOLR-9193

Add scoreNodes Streaming Expression

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Resolved
    • Affects Version/s: None
    • Fix Version/s: 6.2
    • Component/s: SolrJ
    • Labels:
      None

      Description

      The scoreNodes Streaming Expression is another GraphExpression. It will decorate a gatherNodes expression and use a tf-idf scoring algorithm to score the nodes.

      The gatherNodes expression only gathers nodes and aggregations. A node's count is similar in nature to tf in search ranking: the number of times a node appears in the traversal represents the tf. Ranking on this count alone, however, skews recommendations towards nodes that appear frequently in the index.

      Using the idf for each node we can score each node as a function of tf-idf. This will provide a boost to nodes that appear less frequently in the index.
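      To illustrate the effect described above, here is a minimal sketch of tf-idf scoring for a single node. The smoothed idf formula and the names node_score, tf, doc_freq and num_docs are assumptions made for this example; the issue text does not state the exact formula used by the patch.

```python
import math


def node_score(tf, doc_freq, num_docs):
    """Score a node as tf * idf.

    Assumption: a common smoothed idf variant, not necessarily the
    formula implemented in the SOLR-9193 patch.
    """
    idf = math.log((num_docs + 1) / (doc_freq + 1)) + 1.0
    return tf * idf


# Two nodes that appear equally often in the traversal (same tf),
# but one of them is far rarer in the index.
common = node_score(tf=5, doc_freq=9000, num_docs=10000)
rare = node_score(tf=5, doc_freq=10, num_docs=10000)
```

      With the same tf, the rare node receives the higher score, which is the boost the description refers to.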

      The scoreNodes expression will gather the idf values from the shards for each node emitted by the underlying gatherNodes expression. It will then assign a score to each node.

      The computed score will be added to each node in the nodeScore field. The docFreq of the node across the entire collection will be added to each node in the docFreq field. Other streaming expressions can then perform a ranking based on the nodeScore or compute their own score using the docFreq.
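      The decorate-then-rank flow can be sketched in plain Python, outside of Solr. This is only an illustration: the functions score_nodes and top are hypothetical stand-ins for the streaming expressions, the "count(*)" field is assumed to carry the tf, the doc_freqs dictionary stands in for the values gathered from the shards, and the idf formula is an assumed smoothed variant.

```python
import math


def score_nodes(nodes, doc_freqs, num_docs, tf_field="count(*)"):
    """Return copies of the node tuples with docFreq and nodeScore added."""
    scored = []
    for node in nodes:
        doc_freq = doc_freqs.get(node["node"], 0)
        # Assumed smoothed idf; the patch may use a different formula.
        idf = math.log((num_docs + 1) / (doc_freq + 1)) + 1.0
        decorated = dict(node)
        decorated["docFreq"] = doc_freq
        decorated["nodeScore"] = node[tf_field] * idf
        scored.append(decorated)
    return scored


def top(n, tuples, field="nodeScore"):
    """Keep the n tuples with the highest value in the given field."""
    return sorted(tuples, key=lambda t: t[field], reverse=True)[:n]


# Hypothetical traversal output: node "a" appears more often in the
# traversal, but node "b" is much rarer in the collection.
nodes = [{"node": "a", "count(*)": 4}, {"node": "b", "count(*)": 3}]
doc_freqs = {"a": 5000, "b": 20}
ranked = top(10, score_nodes(nodes, doc_freqs, num_docs=10000))
```

      Here node "b" ranks first despite its lower count, because its low docFreq outweighs the difference in tf. A downstream consumer could equally ignore nodeScore and compute its own score from the docFreq field.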

      Proposed syntax:

      top(n="10",
          sort="nodeScore desc",
          scoreNodes(gatherNodes(...)))

        Attachments

        1. SOLR-9193.patch
          22 kB
          Joel Bernstein

              People

              • Assignee: Joel Bernstein (joel.bernstein)
              • Reporter: Joel Bernstein (joel.bernstein)