Description
This JIRA is for discussing whether to replace the spark.mllib DecisionTree and RandomForest implementations with the implementation in spark.ml. The spark.ml implementation is essentially a copy of the spark.mllib one, with slight modifications (removing "bins").
Pros:
- We only need to support one implementation.
- Efficiency gains in spark.ml will benefit both APIs.
Cons:
- As spark.ml tree functionality increases, we will need to maintain code for converting spark.ml trees to spark.mllib trees.
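
To make the conversion cost concrete, here is a rough, standalone sketch of what such conversion code might look like. The classes `MlNode` and `MllibNode` and the function `to_old` are hypothetical, simplified stand-ins written for illustration; they are not the real Spark classes or API. The sketch assumes the older representation carries explicit node ids (root = 1, children 2*id and 2*id + 1) and an explicit leaf flag, while the newer one does not.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for a spark.ml-style node (NOT the real Spark class).
@dataclass
class MlNode:
    prediction: float
    feature: Optional[int] = None      # split feature index; None for a leaf
    threshold: Optional[float] = None  # split threshold; None for a leaf
    left: Optional["MlNode"] = None
    right: Optional["MlNode"] = None


# Hypothetical stand-in for a spark.mllib-style node (NOT the real Spark class).
# Assumption: it additionally stores a node id and a leaf flag.
@dataclass
class MllibNode:
    id: int
    prediction: float
    is_leaf: bool
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["MllibNode"] = None
    right: Optional["MllibNode"] = None


def to_old(node: MlNode, node_id: int = 1) -> MllibNode:
    """Recursively convert a new-style node into an old-style node.

    Ids follow the assumed scheme: root = 1, left = 2*id, right = 2*id + 1.
    """
    if node.left is None or node.right is None:
        # A node without both children is treated as a leaf.
        return MllibNode(id=node_id, prediction=node.prediction, is_leaf=True)
    return MllibNode(
        id=node_id,
        prediction=node.prediction,
        is_leaf=False,
        feature=node.feature,
        threshold=node.threshold,
        left=to_old(node.left, 2 * node_id),
        right=to_old(node.right, 2 * node_id + 1),
    )


# Usage: a one-split tree with two leaves.
tree = MlNode(0.5, feature=0, threshold=1.0,
              left=MlNode(0.0), right=MlNode(1.0))
old_tree = to_old(tree)
```

Every field added to the newer node type would need a matching decision here (map it, drop it, or reject the tree), which is the maintenance burden this con refers to.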
Must:
- Ensure we do not have significant regressions in the new implementation.
Attachments
Issue Links
- blocks: SPARK-12183 Remove spark.mllib tree, forest implementations and use spark.ml (Resolved)
- relates to: SPARK-7131 Move tree,forest implementation from spark.mllib to spark.ml (Resolved)