Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
Description
DecisionTree and RandomForest currently predict only the most likely label for classification and the mean for regression. Additional information about each prediction would be useful:
- For classification: estimated probability of each possible label
- For regression: variance of estimate
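A minimal sketch of how a single tree could produce these two quantities, assuming each leaf keeps the training labels (or targets) that reached it. The function names are hypothetical and are not part of the Spark API; this illustrates the idea only, not the implementation proposed in this issue.

```python
from collections import Counter


def leaf_class_probabilities(leaf_labels):
    """Estimate P(label) at a leaf as the fraction of training labels that reached it.

    Hypothetical helper: assumes the leaf retains its training labels.
    """
    counts = Counter(leaf_labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def leaf_mean_and_variance(leaf_targets):
    """Return (mean, population variance) of the regression targets at a leaf.

    Hypothetical helper: the mean is the usual prediction, the variance
    is the extra information requested for regression.
    """
    n = len(leaf_targets)
    mean = sum(leaf_targets) / n
    variance = sum((y - mean) ** 2 for y in leaf_targets) / n
    return mean, variance
```

In practice a leaf would more likely store sufficient statistics (label counts, or count, sum, and sum of squares) rather than the raw labels; the raw lists are used here only to keep the sketch short.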
RandomForest could also create aggregate predictions in multiple ways:
- Predict mean or median value for regression.
- Compute variance of estimates (across all trees) for both classification and regression.
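The aggregation options above can be sketched as follows, assuming the per-tree predictions are already available as plain Python values. The function names and the dict-based representation of class probabilities are assumptions made for illustration, not Spark's API.

```python
import statistics


def aggregate_regression(tree_predictions):
    """Combine per-tree regression predictions into mean, median, and variance across trees."""
    return {
        "mean": statistics.fmean(tree_predictions),
        "median": statistics.median(tree_predictions),
        # Population variance of the estimates across all trees.
        "variance": statistics.pvariance(tree_predictions),
    }


def aggregate_class_probabilities(per_tree_probs):
    """Combine per-tree class-probability dicts.

    Returns {label: (mean probability, variance across trees)}.
    A label missing from a tree's dict is treated as probability 0.0.
    """
    labels = sorted({label for probs in per_tree_probs for label in probs})
    result = {}
    for label in labels:
        values = [probs.get(label, 0.0) for probs in per_tree_probs]
        result[label] = (statistics.fmean(values), statistics.pvariance(values))
    return result
```

The median is less sensitive than the mean to a few trees with outlying estimates, which is one reason to offer both.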
Issue Links
- is blocked by
  - SPARK-6113 Stabilize DecisionTree and ensembles APIs (Resolved)
  - SPARK-7131 Move tree,forest implementation from spark.mllib to spark.ml (Resolved)
- is duplicated by
  - SPARK-4736 functions returning the category with weights (Resolved)
- is related to
  - SPARK-4240 Refine Tree Predictions in Gradient Boosting to Improve Prediction Accuracy. (Resolved)
Sub-Tasks
1. Decision trees: predict class probabilities | Resolved | Yanbo Liang
2. Random forest: predict class probabilities | Closed | Unassigned
3. Make random forest classifier extend Classifier abstraction | Resolved | Holden Karau
4. DecisionTreeRegressor: provide variance of prediction | Resolved | Yanbo Liang
5. RandomForestRegressor: provide variance of predictions | Resolved | Manoj Kumar