Description
Purpose: a metric for gauging the “relevancy” of a crawl after each round and the “relevancy” of a page. NB: this is not a scoring plugin. By default, the first 25 terms will be stored.
- Return the topN terms per page
- Return the topN terms per segment, based on tf-idf
- Leverage Apache Lucene libs
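
The two topN computations listed above can be illustrated with a minimal sketch. It is plain Java and deliberately does not use Lucene, so it shows the idea rather than the plugin's actual implementation. All names here (`TopTermsSketch`, `topTermsForPage`, `topTermsForSegment`) are hypothetical, the tokenizer is a naive lowercase split, per-page ranking by raw term count is an assumption, and the tf-idf variant (segment-wide term count times `ln(N / df)`) is only one common formulation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

public class TopTermsSketch {
    // Default number of stored terms, taken from the description above.
    static final int DEFAULT_TOP_N = 25;

    // Naive tokenizer (assumption): lowercase, split on non-alphanumerics.
    static List<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    // Raw term counts for a single page.
    static Map<String, Double> termCounts(String page) {
        Map<String, Double> counts = new HashMap<>();
        for (String term : tokenize(page)) {
            counts.merge(term, 1.0, Double::sum);
        }
        return counts;
    }

    // Highest-scoring n terms; ties are broken alphabetically.
    static List<String> topN(Map<String, Double> scores, int n) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        entries.sort((a, b) -> {
            int byScore = Double.compare(b.getValue(), a.getValue());
            return byScore != 0 ? byScore : a.getKey().compareTo(b.getKey());
        });
        return entries.stream().limit(n).map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // topN terms of one page, ranked by raw term count (assumption).
    static List<String> topTermsForPage(String page, int n) {
        return topN(termCounts(page), n);
    }

    // topN terms of a segment (modelled here as a list of page texts),
    // scored as tf * ln(N / df): tf is the term count across the segment,
    // df the number of pages containing the term, N the number of pages.
    static List<String> topTermsForSegment(List<String> pages, int n) {
        Map<String, Double> tf = new HashMap<>();
        Map<String, Integer> df = new HashMap<>();
        for (String page : pages) {
            for (Map.Entry<String, Double> e : termCounts(page).entrySet()) {
                tf.merge(e.getKey(), e.getValue(), Double::sum);
                df.merge(e.getKey(), 1, Integer::sum);
            }
        }
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Double> e : tf.entrySet()) {
            double idf = Math.log((double) pages.size() / df.get(e.getKey()));
            scores.put(e.getKey(), e.getValue() * idf);
        }
        return topN(scores, n);
    }

    public static void main(String[] args) {
        List<String> segment = Arrays.asList(
                "apache nutch crawl crawl",
                "apache lucene index",
                "apache nutch");
        System.out.println(topTermsForPage(segment.get(0), DEFAULT_TOP_N));
        System.out.println(topTermsForSegment(segment, DEFAULT_TOP_N));
    }
}
```

With this weighting, a term that appears in every page of the segment gets an idf of zero and falls to the bottom of the ranking, which is why tf-idf is a more useful relevancy signal at the segment level than raw counts.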