  1. Solr
  2. SOLR-9851

Introduce Interesting Terms Json Facet

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Facet Module, faceting
    • Labels: None

    Description

      While experimenting with Lucene's MoreLikeThis (MLT), I noticed that it already has a couple of methods for calculating the interesting terms from the seed document.

      I think this can be extended into a supported calculation over the search results.
      Specifically, I am thinking of initially adding a new type of JSON Facet (InterestingTerms).

      This new aggregation would calculate the interesting terms from the search results, given:

      • a field
      • a minCount (terms occurring fewer than this many times in the search results are not scored)
      • possibly all the other parameters already supported for faceting
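
      To make the parameters above concrete, here is a rough sketch of what a request might look like. It only builds a JSON body in the general shape of the existing JSON Facet API (a "facet" object whose entries have a "type" and a "field"). The facet type name "interestingTerms", the facet label "interesting", and the helper build_request are hypothetical; nothing like this exists in Solr today. The parameter is spelled "mincount" to mirror the existing terms facet, which is an assumption about how the minCount above would be exposed.

```python
import json


def build_request(query, field, min_count=2, limit=10):
    """Build a JSON request body for a HYPOTHETICAL interesting-terms facet.

    Only the overall "query"/"facet" layout follows the existing JSON Facet
    API; the facet type itself is an assumption based on this proposal.
    """
    return {
        "query": query,
        "facet": {
            "interesting": {                 # arbitrary label for the facet
                "type": "interestingTerms",  # hypothetical, not in Solr
                "field": field,
                "mincount": min_count,       # terms below this are not scored
                "limit": limit,              # borrowed from the terms facet
            }
        },
    }


body = build_request("ipod", "description", min_count=3)
print(json.dumps(body, indent=2))
```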

      Naive implementation:
      The score for each term can be calculated as:
      count * IDF
      where count is the term's count in the search results and IDF is its inverse document frequency.
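
      The naive scoring above can be sketched in a few lines. This is only an illustration of count * IDF, not Solr or Lucene code. It rests on several assumptions: "count" is taken to be the number of matching documents that contain the term (as in ordinary facet counts), the IDF is a smoothed variant, 1 + ln((numDocs + 1) / (docFreq + 1)), since the issue does not say which variant is intended, and the function name, arguments, and sample data are all made up for the example.

```python
import math
from collections import Counter


def interesting_terms(result_docs, doc_freq, num_docs, min_count=2, limit=10):
    """Rank the terms of a result set by count * IDF.

    result_docs: one list of field terms per matching document.
    doc_freq:    term -> number of documents in the whole index containing it.
    num_docs:    total number of documents in the index.
    """
    counts = Counter()
    for terms in result_docs:
        counts.update(set(terms))  # count each term once per matching document

    scored = []
    for term, count in counts.items():
        if count < min_count:
            continue  # below the threshold: skip the score calculation
        # Smoothed IDF (an assumption; the issue does not fix the variant).
        idf = 1.0 + math.log((num_docs + 1) / (doc_freq[term] + 1))
        scored.append((term, count * idf))

    # Highest score first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:limit]


# Made-up data: three matching documents from an index of 1000 documents.
results = [["solr", "facet", "the"], ["solr", "json", "the"], ["lucene", "the"]]
index_doc_freq = {"solr": 10, "facet": 5, "json": 50, "the": 1000, "lucene": 20}
ranked = interesting_terms(results, index_doc_freq, num_docs=1000)
print(ranked)  # "solr" ranks above "the" despite its lower count
```

      The example shows why the IDF factor matters: a term such as "the" appears in every result but also in every indexed document, so a plain count would rank it first while count * IDF does not.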

      Observations
      Looking around the web, I see that a similar type of aggregation was added to Elasticsearch some time ago (see the blog post by Mark at https://www.elastic.co/blog/significant-terms-aggregation ).

      Is there any reason we don't have anything similar yet?

      I will provide a better design and more information soon.

          People

            Assignee: Unassigned
            Reporter: Alessandro Benedetti (abenedetti)
            Votes: 0
            Watchers: 4
