It would be nice to be able to use RF and GBT for feature transformation:
First, fit an ensemble of trees (such as RF, GBT, or other TreeEnsembleModels) on the training set. Each leaf of each tree in the ensemble is then assigned a fixed, arbitrary feature index in a new feature space. These leaf indices are then one-hot encoded.
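To make the encoding concrete, here is a minimal, dependency-free sketch of the idea. It does not use Spark: `Node`, `Leaf`, `Split`, `Tree`, `leafIndex`, and `encode` are all hypothetical names made up for illustration, and the two hand-built trees stand in for a fitted ensemble.

```scala
// Illustrative sketch only; none of these names come from Spark.
object LeafEncodingSketch {
  sealed trait Node
  // A leaf carries its index within its own tree (0 until numLeaves).
  final case class Leaf(index: Int) extends Node
  // An internal node sends x to the left child when x(feature) <= threshold.
  final case class Split(feature: Int, threshold: Double, left: Node, right: Node) extends Node

  final case class Tree(root: Node, numLeaves: Int)

  // Walk one tree and return the index of the leaf that x lands in.
  @annotation.tailrec
  def leafIndex(node: Node, x: Array[Double]): Int = node match {
    case Leaf(i)           => i
    case Split(f, t, l, r) => leafIndex(if (x(f) <= t) l else r, x)
  }

  // One-hot encode the leaf reached in each tree and concatenate the results.
  // Tree k owns the slots offsets(k) until offsets(k + 1) of the output.
  def encode(trees: Seq[Tree], x: Array[Double]): Array[Double] = {
    val offsets = trees.scanLeft(0)(_ + _.numLeaves)
    val out = Array.fill(offsets.last)(0.0)
    trees.zip(offsets).foreach { case (tree, offset) =>
      out(offset + leafIndex(tree.root, x)) = 1.0
    }
    out
  }

  // Two hand-built trees standing in for a fitted ensemble (2 + 3 leaves).
  val tree1: Tree = Tree(Split(0, 0.5, Leaf(0), Leaf(1)), numLeaves = 2)
  val tree2: Tree =
    Tree(Split(1, 2.0, Leaf(0), Split(0, 0.5, Leaf(1), Leaf(2))), numLeaves = 3)
  val ensemble: Seq[Tree] = Seq(tree1, tree2)
}
```

With this toy ensemble, `LeafEncodingSketch.encode(LeafEncodingSketch.ensemble, Array(0.7, 1.0))` yields a 5-dimensional vector with exactly one active slot per tree: `[0, 1, 1, 0, 0]`. The output dimension is the total number of leaves in the ensemble, so for real models a sparse representation would likely be preferable.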
This method was first introduced by Facebook (http://www.herbrich.me/papers/adclicksfacebook.pdf) and is implemented in well-known libraries:
Referring to the design of the implementations above, I propose the following API:
val model1: DecisionTreeClassificationModel = ...
val model2: GBTClassificationModel = ...
The detailed design doc: https://docs.google.com/document/d/1d81qS0zfb6vqbt3dn6zFQUmWeh2ymoRALvhzPpTZqvo/edit?usp=sharing