Details
- Sub-task
- Status: Resolved
- Blocker
- Resolution: Done
Description
For new public APIs added to MLlib (spark.ml only), we need to check the generated HTML doc and compare the Scala & Python versions.
- GOAL: Audit and create JIRAs to fix in the next release.
- NON-GOAL: This JIRA is not for fixing the API parity issues.
We need to track:
- Inconsistency: Do class/method/parameter names match?
- Docs: Is the Python doc missing or just a stub? We want the Python doc to be as complete as the Scala doc.
- API breaking changes: These should be very rare, but they occasionally occur, either out of necessity (intentional) or by accident. They must be recorded and added to the Migration Guide for this release.
- Note: If the API change is for an Alpha/Experimental/DeveloperApi component, please note that as well.
- Missing classes/methods/parameters: We should create to-do JIRAs for functionality missing from Python, to be added in the next release cycle. Please use a separate JIRA (linked below as "requires") for this list of to-do items.
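Part of the checklist above (missing methods and stub docs) can be roughed out mechanically before reading the generated HTML by hand. The following is only a minimal sketch, not the procedure used for this audit: `audit_parity`, `FakeFPGrowth`, the `min_doc_len` threshold, and the list of Scala-side names are all hypothetical and exist purely for illustration. A real audit would pass the actual PySpark class and names taken from the Scala API docs.

```python
import inspect


def audit_parity(scala_members, python_obj, min_doc_len=20):
    """Compare Scala-side member names against a Python class.

    Hypothetical helper: reports names missing from Python and names
    whose Python docstring is absent or shorter than min_doc_len.
    """
    # Public attribute names exposed by the Python class.
    python_members = {
        name
        for name, _ in inspect.getmembers(python_obj)
        if not name.startswith("_")
    }
    scala_members = set(scala_members)

    # Names documented for Scala but absent in Python.
    missing = sorted(scala_members - python_members)

    # Names present on both sides whose Python doc looks like a stub.
    stub_docs = sorted(
        name
        for name in scala_members & python_members
        if len((inspect.getdoc(getattr(python_obj, name)) or "").strip())
        < min_doc_len
    )
    return {"missing_in_python": missing, "stub_docs": stub_docs}


class FakeFPGrowth:
    """Hypothetical stand-in for a PySpark estimator (not the real class)."""

    def setMinSupport(self, value):
        """Sets the minimal support level for a frequent pattern."""

    def setItemsCol(self, value):
        """TODO"""


# Illustrative Scala-side names, assumed to come from the Scala API docs.
report = audit_parity(
    ["setMinSupport", "setItemsCol", "setNumPartitions"], FakeFPGrowth
)
print(report)
```

Each entry in `missing_in_python` would be a candidate for a to-do JIRA, and each entry in `stub_docs` a candidate for a doc fix. Name inconsistencies and breaking changes still need manual review.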
Issue Links
is related to:
- SPARK-20348 Support squared hinge loss (L2 loss) for LinearSVC (Resolved)
- SPARK-20602 Adding LBFGS optimizer and Squared_hinge loss for LinearSVC (Resolved)
requires:
- SPARK-20300 Python API for ALSModel.recommendForAllUsers,Items (Resolved)
- SPARK-20601 Python API Changes for Constrained Logistic Regression Params (Resolved)
- SPARK-19852 StringIndexer.setHandleInvalid should have another option 'new': Python API and docs (Resolved)
- SPARK-19866 Add local version of Word2Vec findSynonyms for spark.ml: Python API (Resolved)
- SPARK-20764 Fix visibility discrepancy with numInstances and degreesOfFreedom in LR and GLR - Python version (Resolved)
- SPARK-20768 PySpark FPGrowth does not expose numPartitions (expert) param (Resolved)