Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Clean up the existing decision tree algorithm in SystemDS ("scripts/algorithms/decision-tree.dml") and convert it into a DML builtin function.
As a preparation for a holistic cleanup of decision tree and random forest scripts (in scripts/builtin and scripts/algorithms), we should first introduce primitives for information gain, entropy, and gini (SYSTEMDS-3184), and devise vectorized prediction scripts.
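For reference, the three primitives named above can be sketched in a few lines. This is a minimal Python illustration of the standard definitions, not the SystemDS implementation from SYSTEMDS-3184; the function names and the `impurity` parameter are hypothetical, and non-empty label lists are assumed.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(labels, left, right, impurity=entropy):
    """Impurity of the parent minus the size-weighted impurity of the two
    children produced by a candidate split (left and right are non-empty)."""
    n = len(labels)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(labels) - weighted
```

Passing `impurity=gini` to `information_gain` switches the split criterion without changing the surrounding logic, which is one way a single builtin could serve both measures.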
As a first step, this task should introduce a new builtin/decisionTreePredict that implements the different strategies of the Hummingbird paper [1]. Initial tests can hard-code the vectorized decision tree representation and focus on testing the prediction procedure. The builtin function signature might expose a 'method' argument to select among the strategies.
[1] Supun Nakandala, Karla Saur, Gyeong-In Yu, Konstantinos Karanasos, Carlo Curino, Markus Weimer, Matteo Interlandi:
A Tensor Compiler for Unified Machine Learning Prediction Serving. OSDI 2020: 899-917, https://www.usenix.org/system/files/osdi20-nakandala.pdf
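To illustrate what a hard-coded vectorized tree and a 'method' switch could look like, here is a minimal Python sketch of two of the strategies described in [1]: GEMM and tree traversal. It is not the SystemDS builtin; the tree, the matrix names A to E, the array names, the method strings "GEMM" and "TT", and the function names are all assumptions made for this example.

```python
# Hypothetical hard-coded tree (2 features, 3 leaves, leaf k predicts class k):
#   n0: x[0] < 0.5 ? n1    : leaf2
#   n1: x[1] < 2.0 ? leaf0 : leaf1

# GEMM representation
A = [[1, 0], [0, 1]]         # features x internal nodes: feature tested by each node
B = [0.5, 2.0]               # threshold of each internal node
C = [[1, 1, -1],             # internal nodes x leaves: +1 if the leaf lies in the
     [1, -1, 0]]             # node's left subtree, -1 if right, 0 if unrelated
D = [2, 1, 0]                # per leaf: number of ancestors reached via a left edge
E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # leaves x classes

# Tree-traversal representation (nodes 0-1 internal, 2-4 leaves that loop to themselves)
LEFT = [1, 3, 2, 3, 4]
RIGHT = [2, 4, 2, 3, 4]
FEATURE = [0, 1, 0, 0, 0]
THRESHOLD = [0.5, 2.0, 0.0, 0.0, 0.0]
LABEL = [-1, -1, 2, 0, 1]    # class per node, -1 for internal nodes
DEPTH = 2


def matmul(X, Y):
    """Plain dense matrix product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def predict_gemm(X):
    T = matmul(X, A)                                               # pick tested features
    T = [[1 if v < b else 0 for v, b in zip(r, B)] for r in T]     # evaluate all nodes
    T = matmul(T, C)                                               # aggregate per leaf
    T = [[1 if v == d else 0 for v, d in zip(r, D)] for r in T]    # one-hot selected leaf
    scores = matmul(T, E)                                          # map leaf to class
    return [r.index(max(r)) for r in scores]


def predict_tt(X):
    idx = [0] * len(X)                       # every row starts at the root
    for _ in range(DEPTH):                   # fixed number of steps, no early exit
        idx = [LEFT[i] if row[FEATURE[i]] < THRESHOLD[i] else RIGHT[i]
               for row, i in zip(X, idx)]
    return [LABEL[i] for i in idx]


def decision_tree_predict(X, method="GEMM"):
    if method == "GEMM":
        return predict_gemm(X)
    if method == "TT":
        return predict_tt(X)
    raise ValueError("unknown method: " + method)
```

Because both methods read the same hand-written tree, a test can compare their outputs on a few rows, for example `[[0.2, 1.0], [0.2, 3.0], [0.9, 1.0]]`, without needing a training step.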
Issue Links
- is blocked by: SYSTEMDS-3184 Builtin for computing information gain using entropy and gini (Resolved)
- links to
1. Decision Tree Predict | Resolved | Unassigned
2. Cleanup decisionTree builtin | Closed | Unassigned
3. Cleanup randomForest builtin | Closed | Matthias Boehm
4. Additional decisionTree/randomForest inference methods | Closed | Matthias Boehm
5. Optional Data Compaction in decisionTree/randomForest | Closed | Matthias Boehm