This is an umbrella JIRA for Manoj Kumar's GSoC 2015 project. The main theme is to enhance MLlib's Python API and bring it on par with the Scala/Java API. The main tasks are:
1. Provide save/load methods for all models in MLlib. This also includes save/load in Scala.
2. Python API for evaluation metrics.
3. Python API for streaming ML algorithms.
4. Python API for distributed linear algebra.
5. Simplify MLLibPythonAPI using DataFrames. Currently, we use customized serialization, which makes MLLibPythonAPI hard to maintain. It would be nice to use DataFrames for serialization instead.
I'll link the JIRAs for each of the tasks.
Note that this doesn't mean all of these JIRAs are pre-assigned to Manoj Kumar. The TODO list will be dynamic, based on the backlog.