Description
Before the release, we need to update the MLlib, GraphX, and SparkR Programming Guides. Updates will include:
- Add migration guide subsection.
- Use the results of the QA audit JIRAs and SPARK-13448.
- Check phrasing, especially in main sections (for outdated items such as "In this release, ...")
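The phrasing check above could be partly automated. The following is a minimal sketch, not part of the Spark build: the helper names `find_outdated_phrases` and `scan_docs`, the `docs` directory, and the `*.md` glob are all assumptions for illustration, and the only phrase taken from this issue is "In this release".

```python
import re
from pathlib import Path

# Hypothetical phrase list; only "In this release" comes from the issue text.
# Extend it with other release-specific wording as needed.
OUTDATED_PATTERNS = [
    re.compile(r"\bin this release\b", re.IGNORECASE),
]


def find_outdated_phrases(text, patterns=OUTDATED_PATTERNS):
    """Return (line_number, stripped_line) pairs for lines matching any pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if any(p.search(line) for p in patterns):
            hits.append((lineno, line.strip()))
    return hits


def scan_docs(docs_dir, glob="*.md"):
    """Scan guide files directly under docs_dir (assumed layout).

    Returns a dict mapping file path to its list of hits, omitting clean files.
    """
    results = {}
    for path in sorted(Path(docs_dir).glob(glob)):
        hits = find_outdated_phrases(path.read_text(encoding="utf-8"))
        if hits:
            results[str(path)] = hits
    return results
```

Calling `scan_docs("docs")` from the repository root (again, an assumed layout) would list candidate lines for manual review. A match only flags wording to look at, since the phrase can also appear in legitimate migration notes.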
For MLlib, we will make the DataFrame-based API (spark.ml) front-and-center, to make it clear the RDD-based API is the older, maintenance-mode one.
- No docs for spark.mllib will be deleted; they will just be reorganized and put in a subsection.
- If spark.ml docs are less complete, or if spark.ml docs say "refer to the spark.mllib docs for details," then we should copy those details to the spark.ml docs. This per-feature work can happen under SPARK-14815.
- This big reorganization should be done after docs are added for each feature (to minimize merge conflicts).
Issue Links
- contains
  - SPARK-15643 ML 2.0 QA: migration guide update (Resolved)
  - SPARK-12071 Programming guide should explain NULL in JVM translate to NA in R (Resolved)
- is blocked by
  - SPARK-14815 ML, Graph, R 2.0 QA: Update user guide for new features & APIs (Resolved)
- relates to
  - SPARK-13448 Document MLlib behavior changes in Spark 2.0 (Resolved)