Description
Include common metrics for ranking algorithms (http://www-nlp.stanford.edu/IR-book/), including:
- Mean Average Precision
- Precision@n: top-n precision
- Discounted cumulative gain (DCG) and NDCG
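For concreteness, the three metrics can be sketched for a single query in plain Scala, using binary relevance as in the IR book linked above. The object and method names here are illustrative only, not part of the proposed API:

```scala
object RankingMetricsSketch {
  // precision@k: fraction of the top-k predictions that appear in the label set.
  def precisionAt(pred: Seq[String], labels: Set[String], k: Int): Double =
    pred.take(k).count(labels.contains).toDouble / k

  // Average precision: precision@i summed at each rank i that holds a relevant
  // item, divided by the total number of relevant labels.
  def averagePrecision(pred: Seq[String], labels: Set[String]): Double = {
    var hits = 0
    var sum = 0.0
    for ((p, i) <- pred.zipWithIndex if labels.contains(p)) {
      hits += 1
      sum += hits.toDouble / (i + 1)
    }
    if (labels.isEmpty) 0.0 else sum / labels.size
  }

  // NDCG@k with binary relevance: DCG over the predicted order, normalized by
  // the DCG of an ideal ordering that puts all relevant items first.
  def ndcgAt(pred: Seq[String], labels: Set[String], k: Int): Double = {
    def dcgTerm(i: Int): Double = 1.0 / (math.log(i + 2) / math.log(2))
    val dcg = pred.take(k).zipWithIndex
      .collect { case (p, i) if labels.contains(p) => dcgTerm(i) }
      .sum
    val idealDcg = (0 until math.min(k, labels.size)).map(dcgTerm).sum
    if (idealDcg == 0.0) 0.0 else dcg / idealDcg
  }
}
```

For example, with predictions `a, b, c` and relevant labels `{a, c}`, precision@2 is 1/2 and the average precision is (1/1 + 2/3) / 2 = 5/6.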
This issue proposes a new class, RankingMetrics, under org.apache.spark.mllib.evaluation, which accepts input (prediction and label pairs) as RDD[(Array[T], Array[T])]. The following methods will be implemented:
RankingMetrics.scala

```scala
class RankingMetrics[T](predictionAndLabels: RDD[(Array[T], Array[T])]) {

  /** Returns the precision@k for each query */
  lazy val precAtK: RDD[Array[Double]]

  /**
   * @param k the position at which to compute the truncated precision
   * @return the average precision at the first k ranking positions
   */
  def precision(k: Int): Double

  /** Returns the average precision for each query */
  lazy val avePrec: RDD[Double]

  /** Returns the mean average precision (MAP) over all queries */
  lazy val meanAvePrec: Double

  /** Returns the normalized discounted cumulative gain for each query */
  lazy val ndcgAtK: RDD[Array[Double]]

  /**
   * @param k the position at which to compute the truncated NDCG
   * @return the average NDCG at the first k ranking positions
   */
  def ndcg(k: Int): Double
}
```
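Assuming the API sketched above, usage would look roughly like the following. The data and variable names are hypothetical; each record pairs one query's ranked predictions with that query's set of relevant labels:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.evaluation.RankingMetrics

val sc = new SparkContext("local", "ranking-metrics-example")

// One record per query: (ranked predictions, ground-truth relevant items).
val predictionAndLabels = sc.parallelize(Seq(
  (Array(1, 2, 3, 4), Array(1, 3)), // query 1: items 1 and 3 are relevant
  (Array(5, 6, 7), Array(6))        // query 2: item 6 is relevant
))

val metrics = new RankingMetrics(predictionAndLabels)
val map = metrics.meanAvePrec // mean average precision over both queries
val p2  = metrics.precision(2) // average precision@2
val n2  = metrics.ndcg(2)      // average NDCG@2
```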
Issue Links
- is related to SPARK-18948 Add Mean Percentile Rank metric for ranking algorithms (Resolved)