Details
- Type: New Feature
- Status: Closed
- Priority: Minor
- Resolution: Fixed
- Version/s: 5.3
- Lucene Fields: New, Patch Available
Description
As explained in the write-up, implementations of many state-of-the-art ranking models have been added to Apache Lucene.
This issue aims to add the Divergence from Independence (DFI) model, the non-parametric counterpart of the Divergence from Randomness (DFR) framework.
DFI is both parameter-free and non-parametric:
- parameter-free: it does not require any parameter tuning or training.
- non-parametric: it does not make any assumptions about word frequency distributions on document collections.
It is highly recommended not to remove stopwords (very common terms such as the, of, and, to, a, in, for, is, on, that) when using this similarity.
For more information, see: "A nonparametric term weighting method for information retrieval based on measuring the divergence from independence".
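The idea behind the model can be illustrated with a small, self-contained Java sketch. It is not the attached patch and not Lucene's API: the class `DfiSketch`, its method names, and the sample statistics are hypothetical, and the formula assumes the standardized independence measure with a base-2 logarithm. Under independence, a term is expected to occur `totalTermFreq * docLen / numberOfTokens` times in a document; only occurrences above that expectation contribute to the score. This is one plausible reading of why stopwords can stay in the index: a very common term that occurs about as often as expected scores zero.

```java
public final class DfiSketch {
    private DfiSketch() {}

    /**
     * Expected frequency of a term in a document, assuming the term and the
     * document are independent. All parameter names are hypothetical.
     */
    public static double expectedFreq(long totalTermFreq, long docLen, long numberOfTokens) {
        return (double) totalTermFreq * docLen / numberOfTokens;
    }

    /**
     * Sketch of a DFI-style score using the standardized measure
     * (freq - expected) / sqrt(expected). Assumption: terms at or below
     * their expected frequency score zero.
     */
    public static double score(double freq, long totalTermFreq, long docLen, long numberOfTokens) {
        double expected = expectedFreq(totalTermFreq, docLen, numberOfTokens);
        if (freq <= expected) {
            return 0.0; // no divergence from independence, so no evidence of relevance
        }
        double measure = (freq - expected) / Math.sqrt(expected);
        return Math.log(measure + 1.0) / Math.log(2.0); // log base 2
    }

    public static void main(String[] args) {
        // Stopword-like term: 50,000 occurrences in a 1,000,000-token collection,
        // seen 5 times in a 100-token document. Expected frequency is exactly 5.
        System.out.println(score(5, 50_000, 100, 1_000_000)); // prints 0.0

        // Rare term: 100 occurrences in the collection, seen 3 times in the
        // same document. Expected frequency is 0.01, so the score is positive.
        System.out.println(score(3, 100, 100, 1_000_000));
    }
}
```

Note that the sketch takes no tuning parameters: every input is a collection or document statistic, which matches the parameter-free property described above.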
Attachments
Issue Links
- relates to LUCENE-2959 [GSoC] Implementing State of the Art Ranking for Lucene (Reopened)