Details
- Type: Sub-task
- Status: Resolved
- Priority: Blocker
- Resolution: Done
- 2.3.0
- None
- None
Description
For new public APIs added to MLlib (spark.ml only), we need to check the generated HTML doc and compare the Scala & Python versions.
- GOAL: Audit and create JIRAs to fix in the next release.
- NON-GOAL: This JIRA is not for fixing the API parity issues.
We need to track:
- Inconsistency: Do class/method/parameter names match?
- Docs: Is the Python doc missing or just a stub? We want the Python doc to be as complete as the Scala doc.
- API breaking changes: These should be very rare, but they are occasionally either necessary (intentional) or accidental. They must be recorded and added to the Migration Guide for this release.
- Note: If the API change is for an Alpha/Experimental/DeveloperApi component, please note that as well.
- Missing classes/methods/parameters: We should create to-do JIRAs for functionality missing from Python, to be added in the next release cycle. Please use a separate JIRA (linked below as "requires") for this list of to-do items.
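As a rough illustration of the "missing classes/methods/parameters" check, the sketch below compares a hand-collected list of Scala member names against the public members of a Python class. It is only a sketch under stated assumptions: `audit_parity`, `public_members`, and `FakeSummary` are hypothetical names introduced here, and the Scala member list is typed in by hand rather than extracted from the generated docs. A real audit would import the actual class from `pyspark.ml` in place of the stand-in.

```python
import inspect


def public_members(cls):
    """Return the names of all public members visible on cls (including inherited ones)."""
    return {name for name, _ in inspect.getmembers(cls) if not name.startswith("_")}


def audit_parity(scala_names, python_cls):
    """Compare a set of Scala member names against the public members of a Python class."""
    python_names = public_members(python_cls)
    scala_names = set(scala_names)
    return {
        "missing_in_python": sorted(scala_names - python_names),
        "python_only": sorted(python_names - scala_names),
    }


# Hypothetical stand-in for a PySpark ML class, used so the sketch runs without Spark.
class FakeSummary:
    def r2(self): ...
    def rootMeanSquaredError(self): ...


# Hypothetical member names, as if copied by hand from the Scala API doc.
scala_members = ["r2", "r2adj", "rootMeanSquaredError"]

report = audit_parity(scala_members, FakeSummary)
print(report)
```

Each name reported under `missing_in_python` would be a candidate for a to-do JIRA. Name matching alone does not cover doc completeness or parameter differences, which still need a manual read of the generated HTML.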
Issue Links
- requires
  - SPARK-22005 CrossValidator, TrainValidationSplit dump sub models to disk when fitting: Python API (Resolved)
  - SPARK-22796 Add multiple column support to PySpark QuantileDiscretizer (Resolved)
  - SPARK-22797 Add multiple column support to PySpark Bucketizer (Resolved)
  - SPARK-21741 Python API for DataFrame-based multivariate summarizer (Resolved)
  - SPARK-23161 Add missing APIs to Python GBTClassifier (Resolved)
  - SPARK-23162 PySpark ML LinearRegressionSummary missing r2adj (Resolved)
  - SPARK-23256 Add columnSchema method to PySpark image reader (Resolved)
  - SPARK-23163 Sync Python ML API docs with Scala (Resolved)