Description
Improvement: code clarity
Currently, we maintain a tree structure, a flat array of nodes, and a parentImpurities array.
Proposed fix: Maintain everything within a growing tree structure.
This would let us eliminate the flat array of nodes, saving storage when we do not grow a full tree. It could also make it easier to pass subtrees to compute nodes (workers) for local training.
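To make the storage argument concrete, here is a minimal sketch of a growing tree. It is written in Python for brevity and is not the actual implementation. The names `GrowingNode`, `split`, `count`, and `full_tree_size` are hypothetical, and the sketch assumes that the flat array is sized for a full binary tree of the maximum depth and that node ids follow a binary-heap layout. Each node stores its own impurity, so no separate impurity array is needed, and children are allocated only when a node is actually split.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GrowingNode:
    """Hypothetical tree node that owns its children and its own impurity."""
    node_id: int                      # assumed 1-based, binary-heap style id
    impurity: float                   # kept on the node, not in a side array
    left: Optional["GrowingNode"] = None
    right: Optional["GrowingNode"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None

    def split(self, left_impurity: float, right_impurity: float) -> None:
        """Allocate the two children only when this node is actually split."""
        self.left = GrowingNode(2 * self.node_id, left_impurity)
        self.right = GrowingNode(2 * self.node_id + 1, right_impurity)

    def count(self) -> int:
        """Number of nodes actually allocated in this subtree."""
        if self.is_leaf():
            return 1
        return 1 + self.left.count() + self.right.count()


def full_tree_size(max_depth: int) -> int:
    """Slots a preallocated flat array would need for a full binary tree."""
    return 2 ** (max_depth + 1) - 1


# Grow an unbalanced tree: only the nodes that are split get children.
root = GrowingNode(node_id=1, impurity=0.5)
root.split(left_impurity=0.3, right_impurity=0.0)
root.left.split(left_impurity=0.1, right_impurity=0.2)
```

In this sketch `root.count()` covers only the five allocated nodes, whereas `full_tree_size(10)` shows how many slots a flat array for a depth-10 full tree would reserve up front. A subtree such as `root.left` is an ordinary object, which is what would make it straightforward to hand to a worker.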
Note:
- This JIRA previously included a second item: we could have a “LearningNode extends Node” setup in which the LearningNode holds metadata for learning (such as impurities). The test-time model could then be extracted from this training-time model, so that the extra information (such as impurities) does not have to be kept after training.
- However, that is really a separate issue, so I removed it from this JIRA.
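For reference, the “LearningNode extends Node” idea from the note could look roughly like the sketch below. It is written in Python for brevity and is only an illustration under stated assumptions: the fields `prediction`, `feature`, and `threshold`, and the methods `predict` and `to_node`, are hypothetical and not taken from the existing code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical test-time node: only what prediction needs."""
    prediction: float
    feature: Optional[int] = None      # None marks a leaf
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def predict(self, features: list) -> float:
        if self.feature is None:
            return self.prediction
        go_left = features[self.feature] <= self.threshold
        child = self.left if go_left else self.right
        return child.predict(features)


@dataclass
class LearningNode(Node):
    """Hypothetical training-time node: adds learning metadata."""
    impurity: float = 0.0

    def to_node(self) -> Node:
        """Copy the subtree into plain Nodes, dropping the impurity."""
        return Node(
            prediction=self.prediction,
            feature=self.feature,
            threshold=self.threshold,
            left=self.left.to_node() if self.left is not None else None,
            right=self.right.to_node() if self.right is not None else None,
        )


# Training-time tree with impurities; the extracted model carries none.
training_root = LearningNode(
    prediction=0.4, feature=0, threshold=2.5, impurity=0.48,
    left=LearningNode(prediction=0.0, impurity=0.0),
    right=LearningNode(prediction=1.0, impurity=0.0),
)
model = training_root.to_node()
```

Here `model` is built entirely from plain `Node` objects, so the impurities live only in the training-time tree and can be discarded once training finishes.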