Description
Testing shows that CrossValidator still has a memory issue: while fitting, it occupies `O(n*sizeof(model))` memory to hold models, whereas a well-optimized implementation should need only `O(parallelism*sizeof(model))`.
This is because each entry in `modelFutures` keeps a reference to its model object after the future completes (the model can still be fetched with `future.value.get.get`), and both `Future.sequence` and the `modelFutures` array hold references to every model future. As a result, every model object stays referenced, and memory use remains `O(n*sizeof(model))`.
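One possible way to avoid the retention described above is to have each future complete with the metric rather than the model, so a finished future has nothing large left to reference. The following is only a minimal sketch under stated assumptions, not the CrossValidator code: `FakeModel`, `fit`, `evaluate`, and `run` are hypothetical stand-ins for a fitted model, `Estimator.fit`, `Evaluator.evaluate`, and the fitting loop.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object MetricOnlyFutures {
  // Hypothetical stand-in for a fitted model; the array mimics a large footprint.
  final case class FakeModel(param: Int, weights: Array[Double])

  // Hypothetical stand-ins for Estimator.fit and Evaluator.evaluate.
  def fit(param: Int): FakeModel =
    FakeModel(param, Array.fill(1000)(param.toDouble))

  def evaluate(model: FakeModel): Double =
    model.weights.sum / model.weights.length

  def run(params: Seq[Int], parallelism: Int): Seq[Double] = {
    val pool = Executors.newFixedThreadPool(parallelism)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // Each future fits and evaluates in one step and completes with a Double.
      // The model is only a local value inside the task, so a completed future
      // no longer references it and it becomes eligible for garbage collection.
      val metricFutures: Seq[Future[Double]] = params.map { p =>
        Future {
          val model = fit(p)
          evaluate(model)
        }
      }
      // Future.sequence now retains n Doubles instead of n models.
      Await.result(Future.sequence(metricFutures), Duration.Inf)
    } finally {
      pool.shutdown()
    }
  }

  def main(args: Array[String]): Unit = {
    println(run(Seq(1, 2, 3, 4), parallelism = 2))
  }
}
```

With a fixed pool of `parallelism` threads, at most `parallelism` tasks run at once, so roughly that many models are reachable at any moment, which matches the `O(parallelism*sizeof(model))` target. The trade-off in this sketch is that the models themselves are discarded, so it only fits cases where the metrics are all that is needed afterwards.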
Issue Links
- is related to: SPARK-22949 Reduce memory requirement for TrainValidationSplit (Resolved)
- links to