Description
There has been quite a bit of excitement around xgboost: https://github.com/dmlc/xgboost
It improves the parallelism of boosting by mixing boosting and bagging, where the bagging component makes the algorithm more parallelizable.
It would be worth exploring implementing this within MLlib (probably as a new algorithm).
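To make the "mixing boosting and bagging" idea concrete, here is a minimal, self-contained sketch of stochastic gradient boosting with regression stumps: each boosting round fits its weak learner on a random row subsample (the bagging ingredient), which is what makes per-round work cheaper and more amenable to parallelism. This is an illustrative toy, not MLlib or xgboost code; all function names (`fit_stump`, `stochastic_gbt`, `predict`) are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-feature regression stump: a threshold plus two leaf means."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - lm) ** 2 for y in left)
               + sum((y - rm) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # degenerate subset: all x values identical
        m = sum(ys) / len(ys)
        return (float("inf"), m, m)
    _, t, lm, rm = best
    return (t, lm, rm)

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def stochastic_gbt(xs, ys, rounds=200, lr=0.1, subsample=0.6, seed=0):
    """Gradient boosting for squared loss, with row subsampling each round."""
    rng = random.Random(seed)
    n = len(xs)
    base = sum(ys) / n          # initial prediction: the global mean
    preds = [base] * n
    stumps = []
    for _ in range(rounds):
        # The "bagging" ingredient: each round trains on a random row subset,
        # so rounds are cheaper and, in a distributed setting, easier to scale.
        idx = rng.sample(range(n), max(2, int(subsample * n)))
        sub_x = [xs[i] for i in idx]
        residuals = [ys[i] - preds[i] for i in idx]  # pseudo-residuals
        stump = fit_stump(sub_x, residuals)
        stumps.append(stump)
        preds = [p + lr * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return (base, lr, stumps)

def predict(model, x):
    base, lr, stumps = model
    return base + lr * sum(predict_stump(s, x) for s in stumps)

# Toy step function: the ensemble should recover the two plateaus.
xs = list(range(10))
ys = [0.0] * 5 + [10.0] * 5
model = stochastic_gbt(xs, ys)
```

Spark's existing GBTs already expose a related knob (`subsamplingRate`); the xgboost proposal goes further by also parallelizing split finding within each tree.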
Issue Links
- Is contained by: SPARK-14047 GBT improvement umbrella (Resolved)
- Relates to: SPARK-4240 Refine Tree Predictions in Gradient Boosting to Improve Prediction Accuracy (Resolved)
- Relates to: SPARK-13868 Random forest accuracy exploration (Resolved)