Spark / SPARK-6192

Enhance MLlib's Python API (GSoC 2015)

Details

    Description

      This is an umbrella JIRA for MechCoder's GSoC 2015 project. The main theme is to enhance MLlib's Python API and bring it on par with the Scala/Java API. The main tasks are:

      1. For all models in MLlib, provide save/load methods. This also
      includes save/load in Scala.
      2. Python API for evaluation metrics.
      3. Python API for streaming ML algorithms.
      4. Python API for distributed linear algebra.
      5. Simplify MLLibPythonAPI using DataFrames. Currently, we use
      customized serialization, which makes MLLibPythonAPI hard to maintain. It
      would be nice to use DataFrames for serialization instead.
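
To make task 1 concrete, here is a minimal sketch of the save/load round-trip contract a model is expected to satisfy. It is plain Python and does not use Spark: `ToyLinearModel`, `round_trip`, and the `model.json` file name are hypothetical names chosen for illustration, and JSON on the local filesystem stands in for whatever storage format the real implementation uses.

```python
import json
import os
import tempfile


class ToyLinearModel:
    """Hypothetical stand-in for a model; not an MLlib class."""

    def __init__(self, weights, intercept):
        self.weights = list(weights)
        self.intercept = float(intercept)

    def predict(self, features):
        # Dot product of weights and features, plus the intercept.
        return sum(w * x for w, x in zip(self.weights, features)) + self.intercept

    def save(self, path):
        # Write the model parameters, tagged with the class name, to a directory.
        os.makedirs(path, exist_ok=True)
        payload = {
            "class": type(self).__name__,
            "weights": self.weights,
            "intercept": self.intercept,
        }
        with open(os.path.join(path, "model.json"), "w") as f:
            json.dump(payload, f)

    @classmethod
    def load(cls, path):
        # Read the parameters back and refuse data saved by another class.
        with open(os.path.join(path, "model.json")) as f:
            payload = json.load(f)
        if payload["class"] != cls.__name__:
            raise ValueError("saved model is not a " + cls.__name__)
        return cls(payload["weights"], payload["intercept"])


def round_trip(model):
    # Save to a temporary directory and load the model back.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy_model")
        model.save(path)
        return ToyLinearModel.load(path)
```

The property worth testing for every model is that a loaded model predicts exactly what the saved one did; the same check applies regardless of whether the model was written from Scala or Python.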

      I'll link the JIRAs for each of the tasks.

      Note that this doesn't mean all these JIRAs are pre-assigned to MechCoder. The TODO list will be dynamic based on the backlog.

            People

              MechCoder Manoj Kumar
              mengxr Xiangrui Meng
              Xiangrui Meng Xiangrui Meng
              Votes: 2
              Watchers: 12
