Spark / SPARK-6192

Enhance MLlib's Python API (GSoC 2015)


Details

    Description

      This is an umbrella JIRA for Manoj Kumar's GSoC 2015 project. The main theme is to enhance MLlib's Python API and bring it on par with the Scala/Java API. The main tasks are:

      1. For all models in MLlib, provide save/load methods. This also
      includes save/load in Scala.
      2. Python API for evaluation metrics.
      3. Python API for streaming ML algorithms.
      4. Python API for distributed linear algebra.
      5. Simplify MLLibPythonAPI using DataFrames. Currently, we use
      customized serialization, which makes MLLibPythonAPI hard to maintain.
      It would be nice to use DataFrames for serialization instead.

      I'll link the JIRAs for each of the tasks.

      Note that this doesn't mean all these JIRAs are pre-assigned to Manoj Kumar. The TODO list will be dynamic based on the backlog.
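
      As a rough illustration of the save/load contract in task 1, the sketch below round-trips a toy model through a directory on disk. It is plain Python, not MLlib code: ToyLinearModel, the model.json file name, and the roundtrip helper are all hypothetical names chosen for this example, and the real MLlib persistence format may differ.

```python
import json
import os
import tempfile


class ToyLinearModel:
    """Hypothetical stand-in for an MLlib model; not part of Spark."""

    def __init__(self, weights, intercept):
        self.weights = [float(w) for w in weights]
        self.intercept = float(intercept)

    def predict(self, features):
        # Dot product of weights and features, plus the intercept.
        return sum(w * x for w, x in zip(self.weights, features)) + self.intercept

    def save(self, path):
        # Write the model parameters as JSON under the given directory.
        os.makedirs(path, exist_ok=True)
        payload = {
            "class": type(self).__name__,
            "weights": self.weights,
            "intercept": self.intercept,
        }
        with open(os.path.join(path, "model.json"), "w") as f:
            json.dump(payload, f)

    @classmethod
    def load(cls, path):
        # Rebuild a model from the directory written by save().
        with open(os.path.join(path, "model.json")) as f:
            payload = json.load(f)
        if payload["class"] != cls.__name__:
            raise ValueError("saved model is a %s, not a %s"
                             % (payload["class"], cls.__name__))
        return cls(payload["weights"], payload["intercept"])


def roundtrip(model):
    # Save to a temporary directory and load the model back.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy_model")
        model.save(path)
        return type(model).load(path)
```

      The property worth checking for any such pair of methods is that a loaded model predicts exactly as the saved one did, for example that roundtrip(m).predict(x) equals m.predict(x).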


          People

            MechCoder (Manoj Kumar)
            mengxr (Xiangrui Meng)
            Xiangrui Meng
            Votes: 2
            Watchers: 12
