Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
- Affects Version/s: None
- Fix Version/s: None
Description
DecisionTree and RandomForest currently predict only the most likely label for classification and the mean for regression. Additional information about each prediction would be useful:
- For classification: the estimated probability of each possible label.
- For regression: the variance of the estimate.
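As a rough illustration of the two quantities requested above, here is a minimal Python sketch. It assumes that a leaf's estimates come from the training labels or targets that reached that leaf, which the issue does not state. The function names `leaf_class_probabilities` and `leaf_regression_stats` are hypothetical and are not part of any Spark API.

```python
from collections import Counter


def leaf_class_probabilities(labels):
    """Estimate P(label) at a leaf as the fraction of training labels equal to it.

    Assumption: the leaf keeps the labels (or their counts) of the training
    instances that reached it.
    """
    if not labels:
        raise ValueError("a leaf must contain at least one training label")
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def leaf_regression_stats(targets):
    """Return (mean, population variance) of the training targets at a leaf.

    The mean is the usual regression prediction; the variance is one possible
    measure of how uncertain that prediction is.
    """
    if not targets:
        raise ValueError("a leaf must contain at least one training target")
    n = len(targets)
    mean = sum(targets) / n
    variance = sum((t - mean) ** 2 for t in targets) / n
    return mean, variance
```

In practice a tree would more likely store sufficient statistics (label counts, or count, sum and sum of squares) rather than the raw values; the raw lists are used here only to keep the sketch short.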
RandomForest could also create aggregate predictions in multiple ways:
- Predict mean or median value for regression.
- Compute variance of estimates (across all trees) for both classification and regression.
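The aggregation options listed above could look roughly like the following Python sketch. It assumes each tree's output is already available as a plain number (regression) or as a label-to-probability mapping (classification). The helper names `aggregate_regression` and `aggregate_classification` are hypothetical and only illustrate the idea; they are not Spark functions.

```python
import statistics


def aggregate_regression(tree_predictions):
    """Combine per-tree regression predictions into mean, median and variance."""
    return {
        "mean": statistics.fmean(tree_predictions),
        "median": statistics.median(tree_predictions),
        # Population variance across trees: a spread measure for the ensemble.
        "variance": statistics.pvariance(tree_predictions),
    }


def aggregate_classification(tree_probabilities):
    """Combine per-tree class probabilities.

    tree_probabilities: one dict per tree, mapping label -> probability.
    Returns label -> (mean probability, population variance across trees).
    A label missing from a tree's dict is treated as probability 0.0.
    """
    labels = sorted({label for probs in tree_probabilities for label in probs})
    result = {}
    for label in labels:
        values = [probs.get(label, 0.0) for probs in tree_probabilities]
        result[label] = (statistics.fmean(values), statistics.pvariance(values))
    return result
```

The median is less sensitive than the mean to a single tree with an extreme prediction, which is one reason to offer both.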
Issue Links
- is blocked by
  - SPARK-6113 Stabilize DecisionTree and ensembles APIs (Resolved)
  - SPARK-7131 Move tree,forest implementation from spark.mllib to spark.ml (Resolved)
- is duplicated by
  - SPARK-4736 functions returning the category with weights (Resolved)
- is related to
  - SPARK-4240 Refine Tree Predictions in Gradient Boosting to Improve Prediction Accuracy. (Resolved)