Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-3251

Clarify learning interfaces

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Auto Closed
    • Affects Version/s: 1.1.0, 1.1.1
    • Fix Version/s: None
    • Component/s: MLlib
    • Labels:

      Description

      Make threshold mandatory
      Currently, the output of predict for an example is either the score
      or the class. This side-effect is caused by clearThreshold. To
      clarify that behaviour three different types of predict (predictScore,
      predictClass, predictProbabilty) were introduced; the threshold is not
      longer optional.

      Clarify classification interfaces
      Currently, some functionality is spreaded over multiple models.
      In order to clarify the structure and simplify the implementation of
      more complex models (like multinomial logistic regression), two new
      classes are introduced:

      • BinaryClassificationModel: for all models that derives a binary classification from a single weight vector. Comprises the tresholding functionality to derive a prediction from a score. It basically captures SVMModel and LogisticRegressionModel.
      • ProbabilitistClassificaitonModel: This trait defines the interface for models that return a calibrated confidence score (aka probability).

      Misc

      • some renaming
      • add test for probabilistic output

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                BigCrunsh Christoph Sawade
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: