Pipelines can currently contain Estimators and Transformers.
Question for debate: Should Pipelines be able to contain Evaluators?
- Schema check: Evaluators take input datasets with a particular schema, which should perhaps be checked before running a Pipeline.
- Intermediate results:
  - If a Transformer removes a column (which is not done by built-in Transformers currently but might be reasonable in the future), then the user can never evaluate that column. (However, users could keep all columns around.)
  - If users have to evaluate after running a Pipeline, then each evaluated column may have to be re-materialized.
- API: Evaluators do not transform datasets. They produce a scalar (or a few values), which makes it hard to say how they fit into a Pipeline or a PipelineModel.
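
The schema-check and API points above can be illustrated with a minimal sketch. It is plain Python, not the actual library API: the Dataset alias, AddPrediction, AccuracyEvaluator, required_columns and check_schema are all hypothetical names chosen for illustration, and a "dataset" is assumed to be a dict mapping column names to lists of values.

```python
from typing import Dict, Iterable, List

# Hypothetical stand-in for a dataset: column name -> list of values.
Dataset = Dict[str, List[float]]


class AddPrediction:
    """Transformer-like stage (hypothetical): takes a dataset, returns a dataset."""

    def transform(self, data: Dataset) -> Dataset:
        out = dict(data)  # keep all existing columns around
        out["prediction"] = [1.0 if x > 0.5 else 0.0 for x in data["feature"]]
        return out


class AccuracyEvaluator:
    """Evaluator-like object (hypothetical): takes a dataset, returns a scalar."""

    required_columns = ("label", "prediction")

    def check_schema(self, columns: Iterable[str]) -> None:
        # This check needs only column names, so it could run before the Pipeline does.
        present = set(columns)
        missing = [c for c in self.required_columns if c not in present]
        if missing:
            raise ValueError(f"missing columns: {missing}")

    def evaluate(self, data: Dataset) -> float:
        self.check_schema(data.keys())
        pairs = list(zip(data["label"], data["prediction"]))
        return sum(1 for y, p in pairs if y == p) / len(pairs)


data = {"feature": [0.9, 0.2, 0.7, 0.4], "label": [1.0, 0.0, 0.0, 0.0]}
scored = AddPrediction().transform(data)         # dataset in, dataset out
accuracy = AccuracyEvaluator().evaluate(scored)  # dataset in, float out
```

In this sketch the output of transform can feed another stage, while the output of evaluate is a float that no later stage can consume. That is the mismatch the API point describes. It also shows that a schema check needs only the column names, not the data.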