[SPARK-2329] Add multi-label evaluation metrics - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.2.0
Component/s: MLlib
Labels: None
Target Version/s: 1.2.0

Description

Spark MLlib has no class for measuring the performance of multi-label classifiers. In multi-label classification, each document can be assigned several labels (classes) at once.

This task adds a class for multi-label evaluation, together with unit tests. The following measures are to be implemented:
(1) precision, recall and F1-measure computed per document and averaged over the number of documents;
(2) precision, recall and F1-measure per label;
(3) label-based precision, recall and F1-measure, micro- and macro-averaged;
(4) Hamming loss.

Reference: Tsoumakas, Grigorios, Ioannis Katakis, and Ioannis Vlahavas. "Mining multi-label data." Data mining and knowledge discovery handbook. Springer US, 2010. 667-685.
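The four groups of measures requested here can be illustrated with a minimal plain-Python sketch. It is not the code from the linked pull request and does not use any Spark API. The function name `multilabel_metrics`, its input format (a list of `(predicted, actual)` label sets, one pair per document) and the convention of returning 0.0 when a denominator is zero are all assumptions made for illustration. The formulas follow the usual definitions from the multi-label literature cited above.

```python
from collections import Counter


def multilabel_metrics(pairs, num_labels=None):
    """Hypothetical helper: pairs is a list of (predicted, actual) label sets."""
    if not pairs:
        raise ValueError("at least one document is required")
    n = len(pairs)
    labels = set()
    for predicted, actual in pairs:
        labels |= predicted | actual
    if num_labels is None:
        # Assumption: the label space is the set of labels seen in the data.
        num_labels = len(labels)

    def ratio(num, den):
        # Assumption: an undefined ratio (zero denominator) counts as 0.0.
        return num / den if den else 0.0

    # (1) Document-based measures, averaged over the number of documents.
    doc_precision = sum(ratio(len(p & a), len(p)) for p, a in pairs) / n
    doc_recall = sum(ratio(len(p & a), len(a)) for p, a in pairs) / n
    doc_f1 = sum(ratio(2 * len(p & a), len(p) + len(a)) for p, a in pairs) / n

    # (2) Per-label true positives, false positives and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for p, a in pairs:
        tp.update(p & a)
        fp.update(p - a)
        fn.update(a - p)
    per_label = {}
    for label in labels:
        prec = ratio(tp[label], tp[label] + fp[label])
        rec = ratio(tp[label], tp[label] + fn[label])
        per_label[label] = (prec, rec, ratio(2 * prec * rec, prec + rec))

    # (3) Micro average pools the counts; macro average is the mean over labels.
    s_tp, s_fp, s_fn = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = (
        ratio(s_tp, s_tp + s_fp),
        ratio(s_tp, s_tp + s_fn),
        ratio(2 * s_tp, 2 * s_tp + s_fp + s_fn),
    )
    macro = tuple(
        ratio(sum(v[i] for v in per_label.values()), len(per_label))
        for i in range(3)
    )

    # (4) Hamming loss: share of (document, label) slots that are wrong.
    hamming_loss = sum(len(p ^ a) for p, a in pairs) / (n * num_labels)

    return {
        "document": (doc_precision, doc_recall, doc_f1),
        "per_label": per_label,
        "micro": micro,
        "macro": macro,
        "hamming_loss": hamming_loss,
    }


# Made-up example: four documents over the label space {0, 1, 2}.
example = [({0, 1}, {0, 1}), ({0}, {0, 2}), ({1, 2}, {2}), ({1}, {0})]
result = multilabel_metrics(example)
```

For this made-up input, the document-based precision and recall both come out at 0.625 and the Hamming loss at 1/3 (4 wrong label slots out of 12). On a Spark cluster the same counts would be accumulated over a distributed collection rather than a Python list.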

Issue Links

links to

[Github] Pull Request #1270 (avulanov)

People

Assignee: Alexander Ulanov

Reporter: Alexander Ulanov

Votes: 0

Watchers: 2

Dates

Created: 30/Jun/14 12:53

Updated: 01/Nov/14 01:31

Resolved: 01/Nov/14 01:31

Time Tracking

Estimated: 72h
Remaining: 72h
Logged: Not Specified