  1. Spark
  2. SPARK-2438

Streaming + MLLib


    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: DStreams, MLlib
    • Labels:
    • Target Version/s:

      Description

      This is a ticket to track progress on developing streaming analyses in MLlib.

      Many streaming applications benefit from or require fitting models online, where the parameters of a model (e.g. regression, clustering) are updated continually as new data arrive. This can be accomplished by incorporating MLlib algorithms into model-updating operations over DStreams. In some cases this can be achieved using existing updaters (e.g. those based on SGD), but in other cases it will require custom update rules (e.g. for KMeans). The goal is to have streaming versions of many common algorithms, in particular regression, classification, clustering, and possibly dimensionality reduction.
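
      To make the idea of per-batch model updating concrete, here is a minimal sketch in plain Python. It does not use Spark or MLlib: each list of records stands in for one batch of a DStream, and the names `sgd_update`, `kmeans_update`, and `fit_stream` are hypothetical helpers chosen for illustration, not MLlib APIs. The first function is an SGD-style step for least-squares regression; the second is one possible custom rule for clustering (an incremental running mean), assumed here only as an example of what such a rule could look like.

```python
# Illustrative only: plain-Python stand-ins for per-batch model updates.
# None of these function names are MLlib or Spark Streaming APIs.

def sgd_update(weights, batch, step_size=0.1):
    """One mini-batch gradient step for least-squares linear regression.

    `batch` is a list of (features, label) pairs; returns the new weights.
    """
    if not batch:
        return list(weights)  # empty batch: leave the model unchanged
    grad = [0.0] * len(weights)
    for features, label in batch:
        error = sum(w * x for w, x in zip(weights, features)) - label
        for j, x in enumerate(features):
            grad[j] += error * x
    n = len(batch)
    return [w - step_size * g / n for w, g in zip(weights, grad)]


def kmeans_update(centers, counts, batch):
    """Fold a batch of points into running cluster means.

    Each point is assigned to its nearest center, and that center is
    moved toward the point using an incremental-mean update.
    """
    centers = [list(c) for c in centers]
    counts = list(counts)
    for point in batch:
        # Nearest center by squared Euclidean distance.
        k = min(
            range(len(centers)),
            key=lambda i: sum((c - p) ** 2 for c, p in zip(centers[i], point)),
        )
        counts[k] += 1
        centers[k] = [c + (p - c) / counts[k] for c, p in zip(centers[k], point)]
    return centers, counts


def fit_stream(batches, weights):
    """Apply the SGD update to each arriving batch in order."""
    for batch in batches:
        weights = sgd_update(weights, batch)
    return weights
```

      In this sketch the model state (weights, or centers and counts) is carried from one batch to the next, which is the role a model-updating operation over a DStream would play. The regression case only needs a gradient step per batch, while the clustering case needs its own rule because the state includes per-cluster counts.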

            People

            • Assignee:
              tdas Tathagata Das
              Reporter:
              freeman-lab Jeremy Freeman
            • Votes:
              1 Vote for this issue
              Watchers:
              6
