Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Incomplete
Description
Some simple instrumentation would help advanced users understand performance and check whether parameters (such as maxMemoryInMB) need to be tuned.
Most important instrumentation (simple):
- min, avg, max nodes per group
- number of groups (passes over data)
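As a rough illustration of the simple statistics listed above, the sketch below summarizes a list of per-group node counts. It is not Spark code: `GroupStats`, `summarize_groups`, and the sample counts are hypothetical names and values chosen for the example, and it assumes the trainer can record how many nodes it handled in each group (one group per pass over the data).

```python
from dataclasses import dataclass


@dataclass
class GroupStats:
    """Hypothetical summary of node groups; not part of any Spark API."""
    num_groups: int   # number of groups, i.e. passes over the data
    min_nodes: int    # smallest number of nodes trained in one group
    avg_nodes: float  # mean number of nodes per group
    max_nodes: int    # largest number of nodes trained in one group


def summarize_groups(nodes_per_group):
    """Compute min/avg/max nodes per group and the number of groups."""
    if not nodes_per_group:
        # No groups were recorded, so report an all-zero summary.
        return GroupStats(0, 0, 0.0, 0)
    return GroupStats(
        num_groups=len(nodes_per_group),
        min_nodes=min(nodes_per_group),
        avg_nodes=sum(nodes_per_group) / len(nodes_per_group),
        max_nodes=max(nodes_per_group),
    )


# Made-up node counts for five groups, for illustration only.
stats = summarize_groups([1, 2, 4, 8, 5])
print(stats)
```

A consistently small average relative to the maximum, or a large number of groups, could be one hint that a parameter such as maxMemoryInMB is worth revisiting.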
More advanced instrumentation:
- For each tree (or averaged over trees), training set accuracy after training each level. This would be useful for visualizing learning behavior (to convince oneself that model selection is being done correctly).
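One way to picture the per-level accuracy idea is to evaluate a finished tree as if it had been truncated at each depth. The sketch below does this for a toy tree stored as nested dicts. The tree layout, `predict_at_depth`, `accuracy_per_level`, and the data are all assumptions made for the example; it also assumes every node stores the prediction it would make if it were a leaf. It does not reflect how Spark represents or trains trees.

```python
def predict_at_depth(node, x, max_depth):
    """Follow splits for at most max_depth levels, then use that node's prediction."""
    depth = 0
    while depth < max_depth and "feature" in node:
        if x[node["feature"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
        depth += 1
    return node["prediction"]


def accuracy_per_level(tree, X, y, max_depth):
    """Training set accuracy of the tree truncated at depths 0..max_depth."""
    accuracies = []
    for depth in range(max_depth + 1):
        correct = sum(
            predict_at_depth(tree, x, depth) == label for x, label in zip(X, y)
        )
        accuracies.append(correct / len(y))
    return accuracies


# Toy tree: internal nodes have "feature"/"threshold"/"left"/"right";
# every node carries the prediction it would make as a leaf.
tree = {
    "prediction": 0, "feature": 0, "threshold": 0.5,
    "left": {"prediction": 0},
    "right": {
        "prediction": 1, "feature": 1, "threshold": 0.5,
        "left": {"prediction": 1},
        "right": {"prediction": 0},
    },
}
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 1, 0]

levels = accuracy_per_level(tree, X, y, max_depth=2)
print(levels)  # [0.75, 0.75, 1.0]
```

Plotting such a list per tree, or averaged over trees, would give the kind of learning curve the description has in mind.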
Issue Links
- Is contained by
  - SPARK-14045 DecisionTree improvement umbrella (Resolved)
  - SPARK-14046 RandomForest improvement umbrella (Resolved)