For new public APIs added to MLlib, we need to check the generated HTML doc and compare the Scala & Python versions. We need to track:
- Inconsistency: Do class/method/parameter names match?
- Docs: Is the Python doc missing or just a stub? We want the Python doc to be as complete as the Scala doc.
- API breaking changes: These should be very rare but are occasionally either necessary (intentional) or accidental. These must be recorded and added to the Migration Guide for this release.
  - Note: If the API change is for an Alpha/Experimental/DeveloperApi component, please note that as well.
- Missing classes/methods/parameters: We should create to-do JIRAs for functionality missing from Python, to be added in the next release cycle. Please use a separate JIRA (linked below as "requires") for this list of to-do items.
  - NOTE: These missing features should be added in the next release. This work is just to generate a list of to-do items for the future.
UPDATE: This only needs to cover spark.ml since spark.mllib is going into maintenance mode.
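Part of the name and doc comparison above can be scripted. The following is a minimal sketch, not existing Spark tooling: `audit_class`, `DemoEstimator`, and the `scala_names` set are hypothetical, and it assumes the Scala member names have been copied by hand from the generated Scala doc. It reports names missing from Python, names that exist only in Python, and Python members whose docstring is absent or looks like a stub.

```python
def public_names(cls):
    """Return the public attribute names visible on a class (including inherited ones)."""
    return {name for name in dir(cls) if not name.startswith("_")}


def audit_class(py_cls, scala_names, min_doc_len=20):
    """Compare a Python class against member names copied from the Scala doc.

    Assumption: scala_names is a hand-collected set of strings; nothing here
    reads the Scala API itself. min_doc_len is an arbitrary stub threshold.
    """
    scala_names = set(scala_names)
    py_names = public_names(py_cls)
    stub_docs = []
    for name in sorted(py_names & scala_names):
        # A missing __doc__ is None; treat it the same as an empty docstring.
        doc = getattr(py_cls, name).__doc__ or ""
        if len(doc.strip()) < min_doc_len:
            stub_docs.append(name)
    return {
        "missing_in_python": sorted(scala_names - py_names),
        "python_only": sorted(py_names - scala_names),
        "stub_docs": stub_docs,
    }


class DemoEstimator:
    """Hypothetical stand-in for a Python API class under review."""

    def setMaxIter(self, value):
        """Set the maximum number of iterations used during fitting."""
        return self

    def setRegParam(self, value):
        """TODO"""
        return self

    def getMaxIter(self):
        return 10


# Hypothetical list of members taken from the corresponding Scala doc page.
scala_names = {"setMaxIter", "setRegParam", "getMaxIter", "setElasticNetParam"}
report = audit_class(DemoEstimator, scala_names)
for category, names in report.items():
    print(category, names)
```

The `missing_in_python` entries are candidates for the separate to-do JIRA, while `stub_docs` entries still need a manual read of the HTML doc, since docstring length is only a rough proxy for completeness.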