Spark / SPARK-6192

Enhance MLlib's Python API (GSoC 2015)

      Description

      This is an umbrella JIRA for Manoj Kumar's GSoC 2015 project. The main theme is to enhance MLlib's Python API and bring it on par with the Scala/Java API. The main tasks are:

      1. For all models in MLlib, provide save/load methods. This also includes save/load in Scala.
      2. Python API for evaluation metrics.
      3. Python API for streaming ML algorithms.
      4. Python API for distributed linear algebra.
      5. Simplify MLLibPythonAPI using DataFrames. Currently, we use customized serialization, which makes MLLibPythonAPI hard to maintain. It would be nice to use DataFrames for serialization.

      I'll link the JIRAs for each of the tasks.

      Note that this doesn't mean all these JIRAs are pre-assigned to Manoj Kumar. The TODO list will be dynamic based on the backlog.
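
As a rough illustration of the kind of computation task 2 would expose from Python, here is a minimal plain-Python sketch of per-class precision and recall computed from (prediction, label) pairs, the same input shape that the Scala evaluation classes take as an RDD. This is not MLlib code: the function name `precision_recall` and the sample data are hypothetical and exist only for this example.

```python
from collections import Counter


def precision_recall(pairs, label):
    """Return (precision, recall) for one class.

    `pairs` is an iterable of (prediction, label) tuples. This is a
    hypothetical helper for illustration, not part of MLlib.
    """
    counts = Counter()
    for predicted, actual in pairs:
        if predicted == label and actual == label:
            counts["tp"] += 1  # correctly predicted this class
        elif predicted == label:
            counts["fp"] += 1  # predicted this class, but it was another
        elif actual == label:
            counts["fn"] += 1  # missed an instance of this class
    predicted_total = counts["tp"] + counts["fp"]
    actual_total = counts["tp"] + counts["fn"]
    # Report 0.0 when the denominator is empty (a convention chosen here).
    precision = counts["tp"] / predicted_total if predicted_total else 0.0
    recall = counts["tp"] / actual_total if actual_total else 0.0
    return precision, recall


# Made-up (prediction, label) pairs for a binary problem.
pairs = [(1.0, 1.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.0, 0.0)]
precision, recall = precision_recall(pairs, 1.0)
```

A distributed version would aggregate the same counts across partitions rather than looping over a local list.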

              People

              • Assignee: Manoj Kumar (MechCoder)
              • Reporter: Xiangrui Meng (mengxr)
              • Shepherd: Xiangrui Meng
              • Votes: 2
              • Watchers: 14