Solr / SOLR-9193

Add scoreNodes Streaming Expression

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Resolved
    • Affects Version/s: None
    • Fix Version/s: 6.2
    • Component/s: SolrJ
    • Labels: None

    Description

      The scoreNodes Streaming Expression is another GraphExpression. It will decorate a gatherNodes expression and use a tf-idf scoring algorithm to score the nodes.

      On its own, the gatherNodes expression only gathers nodes and aggregations. This is similar in nature to tf in search ranking: the number of times a node appears in the traversal represents its tf. Ranking on tf alone skews recommendations towards nodes that appear frequently in the index.

      Using the idf for each node we can score each node as a function of tf-idf. This will provide a boost to nodes that appear less frequently in the index.
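      The weighting described above can be sketched in a few lines of Python. This issue does not state the exact formula, so the smoothed, Lucene-style idf used here (log(numDocs / (docFreq + 1)) + 1) and the function names are assumptions for illustration only.

```python
import math


def idf(doc_freq: int, num_docs: int) -> float:
    """Smoothed inverse document frequency (assumed form, not taken from this issue)."""
    return math.log(num_docs / (doc_freq + 1)) + 1.0


def node_score(tf: int, doc_freq: int, num_docs: int) -> float:
    """tf is the number of times the node appeared in the traversal."""
    return tf * idf(doc_freq, num_docs)


# Two nodes seen equally often in the traversal (tf=5) in a 10,000-document index.
common = node_score(tf=5, doc_freq=9_000, num_docs=10_000)
rare = node_score(tf=5, doc_freq=40, num_docs=10_000)
```

      With equal tf, the node that is rare in the index ends up with the higher score, which is the boost the description is after.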

      The scoreNodes expression will gather the idf values from the shards for each node emitted by the underlying gatherNodes expression, and will then assign a score to each node.
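      As a rough illustration of the gathering step, the sketch below merges per-shard document frequencies into collection-wide totals. It assumes each shard reports a plain mapping of node id to docFreq and that totals are obtained by summing; the response shape, the node ids, and the helper name are hypothetical and do not reflect the actual shard request format.

```python
from collections import Counter


def merge_shard_doc_freqs(shard_responses):
    """Sum per-shard docFreq values into collection-wide totals (assumed merge rule)."""
    total = Counter()
    for response in shard_responses:
        total.update(response)  # Counter.update adds counts for a mapping
    return dict(total)


# Hypothetical responses from two shards for the nodes emitted by the traversal.
shards = [
    {"productA": 30, "productB": 4000},
    {"productA": 10, "productB": 5000},
]
merged = merge_shard_doc_freqs(shards)
```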

      The computed score will be added to each node in the nodeScore field. The docFreq of the node across the entire collection will be added to each node in the docFreq field. Other streaming expressions can then rank by nodeScore or compute their own score from docFreq.

      Proposed syntax:

      top(n="10",
            sort="nodeScore desc",
            scoreNodes(gatherNodes(...))) 
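      The pipeline in the proposed syntax can be mimicked in plain Python to show what each stage contributes. This is only a sketch: the tuple field names "node" and "count(*)", the idf form, and the sample data are assumptions, while nodeScore and docFreq are the field names given in the description.

```python
import math


def score_nodes(nodes, doc_freqs, num_docs, tf_field="count(*)"):
    """Decorate each node tuple with docFreq and nodeScore (assumed tf * idf form)."""
    scored = []
    for node in nodes:
        df = doc_freqs[node["node"]]
        decorated = dict(node)  # copy so the input tuples stay untouched
        decorated["docFreq"] = df
        decorated["nodeScore"] = node[tf_field] * (math.log(num_docs / (df + 1)) + 1.0)
        scored.append(decorated)
    return scored


def top(n, tuples, field="nodeScore"):
    """Rough equivalent of top(n=..., sort="nodeScore desc", ...)."""
    return sorted(tuples, key=lambda t: t[field], reverse=True)[:n]


# Hypothetical traversal output: productB was seen more often but is very common.
nodes = [
    {"node": "productA", "count(*)": 5},
    {"node": "productB", "count(*)": 8},
]
ranked = top(1, score_nodes(nodes, {"productA": 40, "productB": 9000}, 10_000))
```

      Here productA ranks first despite its lower traversal count, because its low docFreq gives it a much larger idf. A downstream expression could also ignore nodeScore and build its own score from the docFreq field.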
      

          People

            Assignee: Joel Bernstein (jbernste)
            Reporter: Joel Bernstein (jbernste)
            Votes: 0
            Watchers: 8
