Details
Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Incomplete
Description
We currently restrict decision trees (DecisionTree, GBT, RandomForest) to maxDepth <= 30. We should remove this restriction to support deep (imbalanced) trees.
Trees store an index for each node, where each index corresponds to a unique position in a binary tree (i.e., the first index of row 0 is 1, the first of row 1 is 2, the first of row 2 is 4, etc., IIRC). Row d therefore starts at index 2^d, so a fixed-width integer index caps the depth a tree can reach.
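To make the indexing scheme concrete, here is a minimal sketch in Python of the arithmetic described above. It is not the Spark code: the function names are hypothetical, and the assumption that indices are stored as signed 32-bit integers is mine, chosen because it is consistent with a depth limit of 30.

```python
# Assumption: node indices are stored in a signed 32-bit integer.
INT_MAX = 2**31 - 1


def first_index_in_row(depth: int) -> int:
    """Index of the leftmost node in a row (the root row is depth 0, index 1)."""
    return 1 << depth


def last_index_in_row(depth: int) -> int:
    """Index of the rightmost node in a row."""
    return (1 << (depth + 1)) - 1


def left_child(index: int) -> int:
    return 2 * index


def right_child(index: int) -> int:
    return 2 * index + 1


def parent(index: int) -> int:
    return index // 2


def depth_of(index: int) -> int:
    """Row that a 1-based node index falls in."""
    return index.bit_length() - 1


def max_depth_for(max_index: int) -> int:
    """Deepest row whose every index is <= max_index."""
    return (max_index + 1).bit_length() - 2


if __name__ == "__main__":
    for d in (0, 1, 2, 30, 31):
        fits = last_index_in_row(d) <= INT_MAX
        print(d, first_index_in_row(d), last_index_in_row(d), fits)
    print("max depth for 32-bit indices:", max_depth_for(INT_MAX))
```

Under that assumption, row 30 ends exactly at 2^31 - 1 and row 31 would start past it, which matches the current limit. Note that the limit depends on depth alone: a deep but sparse (imbalanced) tree hits it even when it has very few nodes.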
With some careful thought, we could probably avoid using indices altogether.
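One way to picture an index-free representation is a tree whose nodes hold direct references to their children, so depth is bounded by memory rather than by the width of an index. The sketch below is only an illustration of that idea in Python, with hypothetical names (`Node`, `build_chain`, `tree_depth`, `count_nodes`); it says nothing about how training would group nodes by level, which is the part that needs the careful thought.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Tree node identified by its position in the object graph, not by an index."""
    prediction: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_chain(depth: int) -> Node:
    """Build a maximally imbalanced tree: each level has one leaf and one internal node."""
    root = Node(prediction=0.0)
    current = root
    for level in range(1, depth + 1):
        current.left = Node(prediction=float(level))   # stays a leaf
        current.right = Node(prediction=float(level))  # gets split at the next level
        current = current.right
    return root


def tree_depth(root: Node) -> int:
    """Depth of the deepest node, computed iteratively to avoid recursion limits."""
    deepest = 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        deepest = max(deepest, depth)
        for child in (node.left, node.right):
            if child is not None:
                stack.append((child, depth + 1))
    return deepest


def count_nodes(root: Node) -> int:
    """Total number of nodes, also computed iteratively."""
    total = 0
    stack = [root]
    while stack:
        node = stack.pop()
        total += 1
        stack.extend(c for c in (node.left, node.right) if c is not None)
    return total


if __name__ == "__main__":
    tree = build_chain(100)
    print(tree_depth(tree), count_nodes(tree))
```

A depth-100 chain like this holds only 201 nodes, yet its deepest position under the binary-tree indexing scheme would need an index of about 2^100.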
Issue Links
- Is contained by: SPARK-14045 DecisionTree improvement umbrella (Resolved)
- relates to: SPARK-3162 Train DecisionTree locally when possible (Resolved)