[SPARK-17906] MulticlassClassificationEvaluator support target label - ASF JIRA

Details

Type: Brainstorming
Status: Resolved
Priority: Minor
Resolution: Not A Problem
Affects Version/s: None
Fix Version/s: None
Component/s: ML
Labels: None

Description

In practice, I sometimes care about a metric for only one specific label.
For example, in CTR prediction, I usually care only about the F1 score of the positive class.

In sklearn, this is supported:

>>> from sklearn.metrics import classification_report
>>> y_true = [0, 1, 2, 2, 2]
>>> y_pred = [0, 0, 2, 2, 1]
>>> target_names = ['class 0', 'class 1', 'class 2']
>>> print(classification_report(y_true, y_pred, target_names=target_names))
             precision    recall  f1-score   support

    class 0       0.50      1.00      0.67         1
    class 1       0.00      0.00      0.00         1
    class 2       1.00      0.67      0.80         3

avg / total       0.70      0.60      0.61         5

Currently, ml only supports the `weightedXXX` metrics, which are averaged over all classes, so I think there is room for improvement here.

The API could be designed like this:

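import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator

val dataset = ...
val evaluator = new MulticlassClassificationEvaluator()
evaluator.setMetricName("f1")
evaluator.evaluate(dataset)       // weighted F1 over all classes (current behavior)

// Proposed: setTarget (not an existing method) restricts the metric to one label
evaluator.setTarget(0.0).setMetricName("f1")
evaluator.evaluate(dataset)       // F1 of class "0" only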
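
To make the difference between the two calls concrete, here is a minimal pure-Python sketch, not Spark code, of what each would compute on the sklearn example data above. The helper names `f1_for_label` and `weighted_f1` are hypothetical and exist only for illustration; the sketch assumes the weighted metric is the support-weighted average of the per-label F1 scores.

```python
from collections import Counter

def f1_for_label(y_true, y_pred, target):
    """F1 of one label, treating `target` as the positive class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    denom = 2 * tp + fp + fn
    # Return 0.0 when the label never appears in either the truth or the predictions.
    return 2 * tp / denom if denom else 0.0

def weighted_f1(y_true, y_pred):
    """Per-label F1 averaged with weights equal to each label's support (hypothetical helper)."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(n * f1_for_label(y_true, y_pred, label) for label, n in support.items()) / total

# Same data as the sklearn example above.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]

print(round(weighted_f1(y_true, y_pred), 2))      # 0.61, the "avg / total" f1-score
print(round(f1_for_label(y_true, y_pred, 2), 2))  # 0.8, the f1-score of class 2
```

The weighted value hides the fact that class 1 has an F1 of 0.00, which is the kind of detail a per-label option would expose.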

What's your opinion? yanboliang josephkb sethah srowen
If this is useful and acceptable, I'm happy to work on it.

People

Assignee: Unassigned

Reporter: Ruifeng Zheng

Votes: 0

Watchers: 3

Dates

Created: 13/Oct/16 12:59

Updated: 08/May/19 10:36

Resolved: 08/May/19 10:36