Details
- Sub-task
- Status: Resolved
- Blocker
- Resolution: Fixed
Description
For new public APIs added to MLlib (spark.ml only), we need to check the generated HTML doc and compare the Scala & Python versions.
- GOAL: Audit and create JIRAs to fix in the next release.
- NON-GOAL: This JIRA is not for fixing the API parity issues.
We need to track:
- Inconsistency: Do class/method/parameter names match?
- Docs: Is the Python doc missing or just a stub? We want the Python doc to be as complete as the Scala doc.
- API breaking changes: These should be very rare, but they are occasionally either necessary (intentional) or accidental. They must be recorded and added to the Migration Guide for this release.
- Note: If the API change is for an Alpha/Experimental/DeveloperApi component, please note that as well.
- Missing classes/methods/parameters: We should create to-do JIRAs for functionality missing from Python, to be added in the next release cycle. Please use a separate JIRA (linked below as "requires") for this list of to-do items.
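The name-matching and missing-member checks above can be partly mechanized. What follows is a minimal sketch, not the procedure used for this audit: `audit_parity` is a hypothetical helper, and the two member lists are made up for illustration rather than taken from the actual Scala or Python API. It assumes the public member names of a class have already been collected from each language's generated docs.

```python
def audit_parity(scala_members, python_members):
    """Compare the public member names of one class across the two APIs.

    Both arguments are iterables of names (methods or parameters).
    Returns the names present on one side only, sorted for stable output.
    """
    scala, python = set(scala_members), set(python_members)
    return {
        "missing_in_python": sorted(scala - python),
        "only_in_python": sorted(python - scala),
    }


# Hypothetical member lists, for illustration only.
scala_api = ["inputCol", "outputCol", "numBuckets", "handleInvalid"]
python_api = ["inputCol", "outputCol", "numBuckets"]

report = audit_parity(scala_api, python_api)
for kind, names in report.items():
    print(kind, names)
```

Each name reported under `missing_in_python` is a candidate for a to-do JIRA. A name that shows up on both sides of the report under slightly different spellings is more likely a naming inconsistency than missing functionality, so the output still needs a manual pass.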
Issue Links
- requires
  - SPARK-18080 Locality Sensitive Hashing (LSH) Python API (Resolved)
  - SPARK-18282 Add model summaries for Python GMM and BisectingKMeans (Resolved)
  - SPARK-18366 Add handleInvalid to Pyspark for QuantileDiscretizer and Bucketizer (Resolved)
  - SPARK-18369 Deprecate runs in Pyspark mllib KMeans (Resolved)