Apache MADlib / MADLIB-990

SVM - novelty detection using 1-class SVM

Details

    Description

      Story

      As a data scientist, I want to use a one-class SVM so that I can decide whether a new observation belongs to the same distribution as existing observations (an inlier), or should be considered as different (an outlier).

      Acceptance

      1) One-class SVM implemented with all supported kernel types (linear, gaussian, polynomial).
      2) For each new observation, output a T/F flag for not-novel/novel.
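
      The two acceptance criteria can be illustrated with a minimal sketch. It is not MADlib code: the function names, the kernel hyperparameters (gamma, degree, coef0), and the support vectors, coefficients, and offset rho below are all hypothetical values chosen for illustration. It assumes the usual one-class SVM decision function f(x) = sum_i alpha_i * K(sv_i, x) - rho, with f(x) >= 0 read as not-novel.

```python
import math

def linear_kernel(x, y):
    """Plain dot product."""
    return sum(a * b for a, b in zip(x, y))

def gaussian_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2); gamma is an illustrative default."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """(x . y + coef0) ** degree; degree and coef0 are illustrative defaults."""
    return (linear_kernel(x, y) + coef0) ** degree

def is_not_novel(x, support_vectors, alphas, rho, kernel):
    """Return True (not-novel) when the decision value is non-negative."""
    score = sum(a * kernel(sv, x) for a, sv in zip(alphas, support_vectors)) - rho
    return score >= 0

# Hypothetical fitted model: these numbers are made up, not the result of training.
support_vectors = [(1.0, 1.0), (1.2, 0.9)]
alphas = [0.5, 0.5]
rho = 0.5

near = is_not_novel((1.1, 1.0), support_vectors, alphas, rho, gaussian_kernel)
far = is_not_novel((5.0, 5.0), support_vectors, alphas, rho, gaussian_kernel)
print(near, far)
```

      Passing the kernel as a function keeps the T/F decision rule identical across the three kernel types; only the similarity measure changes.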

      Note

      a) Similar to the e1071 R package [1], using type = "one-classification" (for novelty detection).

      b) There is an important distinction between novelty detection (this story) and outlier detection for cleaning training data. From reference [2]:

      • novelty detection: the training data is not polluted by outliers, and we are interested in detecting anomalies in new observations. <- this story
      • outlier detection: the training data contains outliers, and we need to fit the central mode of the training data, ignoring the deviant observations. <- we are not trying to solve this unsupervised learning problem in this story.
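
      The novelty-detection workflow described above (fit on clean training data, then flag new observations) can be sketched as follows. This is an illustrative assumption, not MADlib's implementation: it fits a linear one-class SVM by batch subgradient descent on the standard primal objective 0.5*||w||^2 - rho + 1/(nu*n) * sum_i max(0, rho - w.x_i). The function names, the hyperparameters nu, lr, and steps, and the synthetic training data are all hypothetical. With a linear kernel the model only separates the training data from the origin, so a gaussian kernel would be needed to enclose the data on all sides.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_linear_one_class_svm(points, nu=0.1, lr=0.01, steps=5000):
    """Batch subgradient descent on the one-class SVM primal (linear kernel)."""
    n = len(points)
    dim = len(points[0])
    w = [0.0] * dim
    rho = 0.0
    for _ in range(steps):
        grad_w = list(w)   # gradient of 0.5 * ||w||^2
        grad_rho = -1.0    # gradient of -rho
        for x in points:
            if dot(w, x) < rho:  # hinge term max(0, rho - w.x) is active
                for j in range(dim):
                    grad_w[j] -= x[j] / (nu * n)
                grad_rho += 1.0 / (nu * n)
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        rho -= lr * grad_rho
    return w, rho

def is_inlier(w, rho, x):
    """True for not-novel, False for novel."""
    return dot(w, x) >= rho

# Synthetic, outlier-free training data clustered around (2, 2).
rng = random.Random(0)
train = [(2 + rng.uniform(-0.3, 0.3), 2 + rng.uniform(-0.3, 0.3)) for _ in range(50)]

model_w, model_rho = fit_linear_one_class_svm(train)

# Score new observations that were not part of the training set.
for new_point in [(2.0, 2.0), (0.0, 0.0)]:
    print(new_point, is_inlier(model_w, model_rho, new_point))
```

      The training set is never cleaned or re-labelled here; only the new observations are classified, which is what separates this story from outlier detection.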

      References

      [1] e1071 R package
      https://cran.r-project.org/web/packages/e1071/index.html

      [2] Difference between novelty and outlier detection
      http://scikit-learn.org/stable/modules/outlier_detection.html

            People

              njayaram Nandish Jayaram
              fmcquillan Frank McQuillan