SPARK-3568: Add metrics for ranking algorithms

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: MLlib
    • Labels: None

    Description

      Add common evaluation metrics for ranking algorithms (see http://www-nlp.stanford.edu/IR-book/), including:

      • Mean Average Precision
      • Precision@n: top-n precision
      • Discounted cumulative gain (DCG) and NDCG
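
      For reference, the metrics listed above are commonly defined as follows for a single query with ranked predictions d_1, ..., d_n and relevant set R. This is a sketch under assumptions that the issue does not state: relevance is binary, the DCG discount uses a base-2 logarithm, and average precision is normalized by |R|. An implementation may reasonably choose differently.

```latex
\begin{align*}
\mathrm{rel}_i &= \mathbf{1}[d_i \in R] \\
\mathrm{P@}k &= \frac{1}{k}\sum_{i=1}^{k} \mathrm{rel}_i \\
\mathrm{AP} &= \frac{1}{|R|}\sum_{i=1}^{n} \mathrm{rel}_i \cdot \mathrm{P@}i \\
\mathrm{MAP} &= \frac{1}{|Q|}\sum_{q \in Q} \mathrm{AP}(q) \\
\mathrm{DCG@}k &= \sum_{i=1}^{k} \frac{\mathrm{rel}_i}{\log_2(i+1)} \\
\mathrm{IDCG@}k &= \sum_{i=1}^{\min(k,\,|R|)} \frac{1}{\log_2(i+1)} \\
\mathrm{NDCG@}k &= \frac{\mathrm{DCG@}k}{\mathrm{IDCG@}k}
\end{align*}
```

      Here Q is the set of queries, and IDCG@k is the DCG of an ideal ranking that places all relevant items first.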

      The proposal is to add a new class, RankingMetrics, under org.apache.spark.mllib.evaluation. It accepts (prediction, label) pairs as an RDD[(Array[T], Array[T])], one pair per query, and provides the following members:

      RankingMetrics.scala
      class RankingMetrics[T](predictionAndLabels: RDD[(Array[T], Array[T])]) {
        /* Returns the precision@k for each query */
        lazy val precAtK: RDD[Array[Double]]
      
        /**
         * @param k the position to compute the truncated precision
         * @return the average precision at the first k ranking positions
         */
        def precision(k: Int): Double
      
        /* Returns the average precision for each query */
        lazy val avePrec: RDD[Double]
      
        /* Returns the mean average precision (MAP) of all the queries */
        lazy val meanAvePrec: Double
      
        /* Returns the normalized discounted cumulative gain for each query */
        lazy val ndcgAtK: RDD[Array[Double]]
      
        /**
         * @param k the position to compute the truncated ndcg
         * @return the average ndcg at the first k ranking positions
         */
        def ndcg(k: Int): Double
      }
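
      To make the intended semantics of the interface above concrete, here is a minimal local sketch, not the implementation from this issue. The class name LocalRankingMetrics and the helpers gains, discount, at and mean are hypothetical. A Seq stands in for the RDD so the code runs without Spark. It assumes binary relevance, a 1 / log2(i + 2) discount for zero-based position i, average precision normalized by the number of distinct labels, and that every prediction array has at least k entries.

```scala
// Hypothetical local stand-in for the proposed class: Seq replaces RDD.
class LocalRankingMetrics[T](predictionAndLabels: Seq[(Array[T], Array[T])]) {

  // DCG discount for zero-based position i: 1 / log2(i + 2) (assumed form).
  private def discount(i: Int): Double = math.log(2.0) / math.log(i + 2.0)

  // Per query: 1.0 where the predicted item is relevant, else 0.0, in ranking
  // order, paired with the number of distinct relevant labels.
  private val gains: Seq[(List[Double], Int)] = predictionAndLabels.map {
    case (pred, lab) =>
      val relevant = lab.toSet
      (pred.toList.map(d => if (relevant.contains(d)) 1.0 else 0.0), relevant.size)
  }

  /** Precision at every cutoff 1..n for each query. */
  lazy val precAtK: Seq[Array[Double]] = gains.map { case (g, _) =>
    g.scanLeft(0.0)(_ + _).tail.zipWithIndex
      .map { case (hits, i) => hits / (i + 1) }
      .toArray
  }

  /** Precision at cutoff k, averaged over all queries. */
  def precision(k: Int): Double = mean(precAtK.map(at(_, k)))

  /** Average precision for each query (0.0 when a query has no labels). */
  lazy val avePrec: Seq[Double] = gains.zip(precAtK).map { case ((g, nRel), p) =>
    if (nRel == 0) 0.0
    else g.zip(p.toList).map { case (rel, prec) => rel * prec }.sum / nRel
  }

  /** Mean average precision (MAP) over all queries. */
  lazy val meanAvePrec: Double = mean(avePrec)

  /** NDCG at every cutoff 1..n for each query. */
  lazy val ndcgAtK: Seq[Array[Double]] = gains.map { case (g, nRel) =>
    val dcg = g.zipWithIndex
      .map { case (rel, i) => rel * discount(i) }
      .scanLeft(0.0)(_ + _).tail
    // Ideal ranking: all relevant items occupy the first nRel positions.
    val ideal = g.indices
      .map(i => if (i < nRel) discount(i) else 0.0)
      .scanLeft(0.0)(_ + _).tail
    dcg.zip(ideal).map { case (d, id) => if (id == 0.0) 0.0 else d / id }.toArray
  }

  /** NDCG at cutoff k, averaged over all queries. */
  def ndcg(k: Int): Double = mean(ndcgAtK.map(at(_, k)))

  // Value at one-based cutoff k; the sketch rejects k beyond the ranking length.
  private def at(xs: Array[Double], k: Int): Double = {
    require(k >= 1 && k <= xs.length, s"k must be in 1..${xs.length}")
    xs(k - 1)
  }

  private def mean(xs: Seq[Double]): Double =
    if (xs.isEmpty) 0.0 else xs.sum / xs.size
}
```

      For example, with two queries whose predictions are (1, 2, 3, 4) and (5, 6, 7, 8) and whose labels are (1, 3) and (6), precision(2) is 0.5 and meanAvePrec is 2/3 under the assumptions above. In the actual class the per-query values would be RDDs, and the averages would be computed with distributed aggregations.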
      

            People

              coderxiang Shuo Xiang
              coderxiang Shuo Xiang
              Xiangrui Meng Xiangrui Meng