Description
In multiple places in MLlib, we broadcast a model before prediction. Since prediction may be called many times on the same model, we should store the broadcast variable in a private var so that we broadcast at most once.
I'll link subtasks for each problem case I find.
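A minimal sketch of the pattern, for illustration only. `DotProductModel` and its fields are hypothetical names, not actual MLlib classes, and the real fixes in the subtasks may differ in detail. The idea is to keep the broadcast handle in a private var and create it only on the first call to `predict`:

```scala
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.rdd.RDD

// Hypothetical model used only to illustrate the pattern; not an MLlib class.
class DotProductModel(val weights: Array[Double]) extends Serializable {

  // Cached broadcast handle. @transient keeps it out of the serialized model,
  // so it is null again after deserialization and gets recreated on demand.
  @transient private var bcWeights: Broadcast[Array[Double]] = null

  def predict(data: RDD[Array[Double]]): RDD[Double] = {
    // Broadcast on the first call only; later calls reuse the cached handle.
    if (bcWeights == null) {
      bcWeights = data.sparkContext.broadcast(weights)
    }
    // Copy to a local val so the closure captures the handle, not `this`.
    val bc = bcWeights
    data.mapPartitions { rows =>
      val w = bc.value // read the broadcast value once per partition
      rows.map(x => x.zip(w).map { case (a, b) => a * b }.sum)
    }
  }
}
```

Without the cached var, each `predict` call would create a new broadcast of the same weights. This sketch assumes a single SparkContext and is not thread-safe; both points would need attention in real code.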
Sub-Tasks
1. ML model broadcasts should be stored in private vars: spark.ml tree ensembles | Closed | Unassigned
2. ML model broadcasts should be stored in private vars: spark.ml Word2Vec | Closed | Unassigned
3. ML model broadcasts should be stored in private vars: mllib NaiveBayes | Closed | Unassigned
4. ML model broadcasts should be stored in private vars: mllib clustering | Closed | Unassigned
5. ML model broadcasts should be stored in private vars: mllib IDFModel | Closed | Unassigned
6. ML model broadcasts should be stored in private vars: mllib GeneralizedLinearModel | Closed | Unassigned