Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
- 1.3.0
- None
Description
This issue was raised by prudenko in this JIRA.
*Proposal*:
When Pipeline.fit is given an array of ParamMaps, it should operate incrementally:
- For each set of parameters applicable to the first PipelineStage,
  - Fit/transform that stage using that set of parameters.
  - For each set of parameters applicable to the second PipelineStage,
    - etc.
This is essentially a depth-first search on the parameters, where each node/level in the search tree is a PipelineStage and each node's child nodes correspond to the set of ParamMaps for that PipelineStage.
This will avoid recomputing intermediate RDDs during model search.
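As a rough illustration of the proposal, the depth-first search could be sketched as below. This is a toy model in plain Python, not Spark code: `fit_incrementally`, `scale_stage`, `shift_stage`, and the `calls` counter are hypothetical names, each stage is assumed to be a function returning a `(model, transformed_data)` pair, and the parameter sets are assumed to be already split per stage.

```python
# Hypothetical sketch: stages are plain functions, not Spark PipelineStages.
calls = {"scale": 0, "shift": 0}


def scale_stage(data, params):
    """Toy stage: multiply every value by params['factor']."""
    calls["scale"] += 1
    factor = params["factor"]
    return ("scale", factor), [x * factor for x in data]


def shift_stage(data, params):
    """Toy stage: add params['offset'] to every value."""
    calls["shift"] += 1
    offset = params["offset"]
    return ("shift", offset), [x + offset for x in data]


def fit_incrementally(stages, stage_param_grids, data):
    """Depth-first search over per-stage parameter sets.

    stages: list of functions (data, params) -> (model, transformed_data)
    stage_param_grids: one list of parameter dicts per stage
    Returns a list of (chosen_params, fitted_models) tuples, one per leaf.
    """
    results = []

    def dfs(depth, current_data, chosen_params, fitted_models):
        if depth == len(stages):
            results.append((tuple(chosen_params), tuple(fitted_models)))
            return
        for params in stage_param_grids[depth]:
            # Fit/transform this stage once per parameter set, then reuse
            # the transformed output for every combination further down.
            model, transformed = stages[depth](current_data, params)
            dfs(depth + 1, transformed,
                chosen_params + [params], fitted_models + [model])

    dfs(0, data, [], [])
    return results


grids = [
    [{"factor": 1}, {"factor": 2}],
    [{"offset": 0}, {"offset": 10}, {"offset": 20}],
]
results = fit_incrementally([scale_stage, shift_stage], grids, [1, 2, 3])
print(len(results), calls)
```

With two parameter sets for the first stage and three for the second, the first stage is fitted twice rather than once per full combination (six times), which is the saving the proposal is after.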
Issue Links
- Is contained by: SPARK-19071 Optimizations for ML Pipeline Tuning (Resolved)