SPARK-2329

Add multi-label evaluation metrics

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: MLlib
    • Labels: None

    Description

      Spark MLlib has no class for measuring the performance of multi-label classifiers. In multi-label classification, each document can be assigned several labels (classes) at once.

      This task adds a class for multi-label evaluation, together with unit tests. The following measures are to be implemented: (1) document-based precision, recall and F1-measure, averaged over the number of documents; (2) precision, recall and F1-measure per label; (3) label-based precision, recall and F1-measure, micro- and macro-averaged; (4) Hamming loss. Reference: Tsoumakas, Grigorios, Ioannis Katakis, and Ioannis Vlahavas. "Mining multi-label data." Data Mining and Knowledge Discovery Handbook. Springer US, 2010. 667-685.
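
For illustration, the four groups of measures listed above can be sketched in plain Python. This is not the MLlib class or its API: `multilabel_metrics`, `ratio` and the toy `docs` data are hypothetical names and values invented for this example. It also assumes two conventions the description does not state: a ratio with a zero denominator counts as 0.0, and Hamming loss is normalized by the number of distinct labels seen in the data.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# One document: (predicted labels, true labels). Hypothetical representation.
Doc = Tuple[Set[int], Set[int]]


def ratio(num: float, den: float) -> float:
    """Divide, returning 0.0 for a zero denominator (assumed convention)."""
    return num / den if den else 0.0


def multilabel_metrics(docs: List[Doc]):
    n_docs = len(docs)
    tp, fp, fn = Counter(), Counter(), Counter()  # per-label counts
    doc_p = doc_r = doc_f1 = 0.0
    mismatches = 0
    for pred, true in docs:
        hit = pred & true
        # (1) document-based measures, summed here and averaged below
        doc_p += ratio(len(hit), len(pred))
        doc_r += ratio(len(hit), len(true))
        doc_f1 += ratio(2 * len(hit), len(pred) + len(true))
        # (4) Hamming loss counts labels that are wrongly present or missing
        mismatches += len(pred ^ true)
        tp.update(hit)
        fp.update(pred - true)
        fn.update(true - pred)

    labels = sorted(set(tp) | set(fp) | set(fn))
    # (2) per-label (precision, recall, F1)
    per_label: Dict[int, Tuple[float, float, float]] = {
        l: (
            ratio(tp[l], tp[l] + fp[l]),
            ratio(tp[l], tp[l] + fn[l]),
            ratio(2 * tp[l], 2 * tp[l] + fp[l] + fn[l]),
        )
        for l in labels
    }
    n_labels = len(labels)
    tp_sum, fp_sum, fn_sum = sum(tp.values()), sum(fp.values()), sum(fn.values())
    summary = {
        "doc_precision": ratio(doc_p, n_docs),
        "doc_recall": ratio(doc_r, n_docs),
        "doc_f1": ratio(doc_f1, n_docs),
        # (3) micro-averaging pools the counts over all labels ...
        "micro_precision": ratio(tp_sum, tp_sum + fp_sum),
        "micro_recall": ratio(tp_sum, tp_sum + fn_sum),
        "micro_f1": ratio(2 * tp_sum, 2 * tp_sum + fp_sum + fn_sum),
        # ... macro-averaging takes the mean of the per-label values
        "macro_precision": ratio(sum(v[0] for v in per_label.values()), n_labels),
        "macro_recall": ratio(sum(v[1] for v in per_label.values()), n_labels),
        "macro_f1": ratio(sum(v[2] for v in per_label.values()), n_labels),
        "hamming_loss": ratio(mismatches, n_docs * n_labels),
    }
    return summary, per_label


# Toy data: three documents over labels {0, 1, 2}
docs = [({0, 1}, {0, 2}), ({0, 2}, {0, 2}), ({1}, {1, 2})]
summary, per_label = multilabel_metrics(docs)
```

Micro-averaging weights every label decision equally, so frequent labels dominate the result, while macro-averaging gives each label the same weight regardless of how often it occurs.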

          People

            Assignee: Alexander Ulanov (avulanov)
            Reporter: Alexander Ulanov (avulanov)
            Votes: 0
            Watchers: 2

              Time Tracking

                Estimated: 72h
                Remaining: 72h
                Logged: Not Specified