Description
The ML Pipeline implementation of trees has diverged sufficiently from MLlib to warrant its own user guide. Notable features that exist only in ML include:
- Class probabilities (SPARK-6884, SPARK-6885)
- Random Forest feature importance (SPARK-5133)
- Classification thresholds (SPARK-8069)
- Checkpointing (SPARK-6684)
We should add a new section to ml-guide#algorithm-guides to describe trees in the ML Pipeline.
Issue Links
- is related to
  - SPARK-6885 Decision trees: predict class probabilities (Resolved)
  - SPARK-6884 Random forest: predict class probabilities (Closed)
  - SPARK-5133 Feature Importance for Random Forests (Resolved)
  - SPARK-6684 Add checkpointing to GradientBoostedTrees (Resolved)
  - SPARK-8069 Add support for cutoff to RandomForestClassifier (Resolved)
  - SPARK-7772 User guide for spark.ml trees and ensembles (Resolved)
- is required by
  - SPARK-9668 ML 1.5 QA: Docs: Check for new APIs (Resolved)