[SPARK-3251] Clarify learning interfaces - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Auto Closed
Affects Version/s: 1.1.0, 1.1.1
Fix Version/s: None
Component/s: MLlib
Labels:
- bulk-closed

Description

Make threshold mandatory
Currently, predict returns either the score or the class for an
example, depending on whether clearThreshold has been called. To make
this behaviour explicit, three separate predict methods (predictScore,
predictClass, predictProbability) are introduced, and the threshold is
no longer optional.
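
A minimal sketch of how the three predict methods could sit side by side once the threshold is a required constructor argument. This is an illustration, not the code from the linked pull request: the class name LogisticModelSketch, the use of Array[Double] instead of MLlib's Vector, and the choice to compare the probability against the threshold with >= are all assumptions.

```scala
// Hypothetical sketch: names and signatures are illustrative assumptions,
// not the API proposed in the pull request.
final class LogisticModelSketch(
    weights: Array[Double],
    intercept: Double,
    threshold: Double) { // mandatory: there is no clearThreshold

  /** Raw linear score: dot(weights, features) + intercept. */
  def predictScore(features: Array[Double]): Double = {
    require(features.length == weights.length, "dimension mismatch")
    weights.zip(features).map { case (w, x) => w * x }.sum + intercept
  }

  /** Probability of the positive class, via the logistic function. */
  def predictProbability(features: Array[Double]): Double =
    1.0 / (1.0 + math.exp(-predictScore(features)))

  /** Class label (0.0 or 1.0), obtained by thresholding the probability. */
  def predictClass(features: Array[Double]): Double =
    if (predictProbability(features) >= threshold) 1.0 else 0.0
}
```

With this shape, each method has exactly one return meaning, so a caller never has to know whether a threshold was cleared earlier.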

Clarify classification interfaces
Currently, some functionality is spread across multiple models.
To clarify the structure and simplify the implementation of
more complex models (such as multinomial logistic regression), two new
classes are introduced:

- BinaryClassificationModel: for all models that derive a binary classification from a single weight vector. It contains the thresholding functionality that turns a score into a prediction, and essentially covers SVMModel and LogisticRegressionModel.
- ProbabilisticClassificationModel: a trait that defines the interface for models that return a calibrated confidence score (i.e. a probability).
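
The relationship between the two classes described above can be sketched roughly as follows. This is a hedged illustration only: the member names, the Array[Double] feature type, the constructor layout, and the two concrete *Sketch classes are assumptions, and the type names use the conventional spelling of "probabilistic classification".

```scala
// Hypothetical sketch of the proposed hierarchy; not the pull request's code.

/** Interface for models that return a calibrated probability. */
trait ProbabilisticClassificationModel {
  def predictProbability(features: Array[Double]): Double
}

/** Binary classifier derived from a single weight vector. */
abstract class BinaryClassificationModel(
    val weights: Array[Double],
    val intercept: Double,
    val threshold: Double) {

  /** Raw linear score: dot(weights, features) + intercept. */
  def predictScore(features: Array[Double]): Double = {
    require(features.length == weights.length, "dimension mismatch")
    var score = intercept
    var i = 0
    while (i < weights.length) {
      score += weights(i) * features(i)
      i += 1
    }
    score
  }

  /** Shared thresholding: by default the raw score is compared. */
  def predictClass(features: Array[Double]): Double =
    if (predictScore(features) >= threshold) 1.0 else 0.0
}

/** SVM-like model: inherits score-based thresholding unchanged. */
class SVMModelSketch(w: Array[Double], b: Double, t: Double)
  extends BinaryClassificationModel(w, b, t)

/** Logistic model: adds a probability and thresholds on it instead. */
class LogisticRegressionModelSketch(w: Array[Double], b: Double, t: Double)
  extends BinaryClassificationModel(w, b, t)
  with ProbabilisticClassificationModel {

  override def predictProbability(features: Array[Double]): Double =
    1.0 / (1.0 + math.exp(-predictScore(features)))

  override def predictClass(features: Array[Double]): Double =
    if (predictProbability(features) >= threshold) 1.0 else 0.0
}
```

Keeping the probability interface as a separate trait means a model without calibrated output, such as the SVM-like sketch, simply does not mix it in.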

Misc

- Some renaming
- Add a test for probabilistic output

Issue Links

relates to
- SPARK-3702 Standardize MLlib classes for learners, models (Closed)
- SPARK-10817 ML abstraction umbrella (Resolved)

links to
- [Github] Pull Request #2137 (BigCrunsh)

People

Assignee: Unassigned
Reporter: Christoph Sawade
Votes: 0
Watchers: 4

Dates

Created: 27/Aug/14 13:05
Updated: 06/Jun/19 13:57
Resolved: 06/Jun/19 13:57