Description
OneHotEncoder should be an Estimator, as it is in scikit-learn (http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html).
In its current form, it cannot be used when the number of categories differs between the training dataset and the test dataset.
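The mismatch can be illustrated with a minimal pure-Python sketch. It is not Spark's or scikit-learn's implementation; the names `encode_stateless`, `OneHotEstimatorSketch`, and `OneHotModelSketch` are hypothetical, and the assumption that the stateless version infers the vector size from the data it is given is a simplification made for illustration.

```python
# Hypothetical illustration only; not Spark's or scikit-learn's code.

def encode_stateless(values):
    """Transformer-style encoding: the vector width is inferred from
    whatever data is being transformed (simplifying assumption)."""
    size = max(values) + 1
    return [[1.0 if i == v else 0.0 for i in range(size)] for v in values]


class OneHotModelSketch:
    """Result of fitting: remembers the category count seen at fit time."""

    def __init__(self, num_categories):
        self.num_categories = num_categories

    def transform(self, values):
        rows = []
        for v in values:
            if not 0 <= v < self.num_categories:
                raise ValueError(f"unseen category index: {v}")
            rows.append(
                [1.0 if i == v else 0.0 for i in range(self.num_categories)]
            )
        return rows


class OneHotEstimatorSketch:
    """Estimator-style encoder: fit() learns the category count once."""

    def fit(self, values):
        return OneHotModelSketch(max(values) + 1)


train = [0, 1, 2, 1]  # category indices; three distinct categories
test = [0, 1]         # only two of those categories appear

# Stateless encoding yields vectors of different widths for train and test.
stateless_train_width = len(encode_stateless(train)[0])
stateless_test_width = len(encode_stateless(test)[0])

# Fitting on the training data fixes the width for every later transform.
model = OneHotEstimatorSketch().fit(train)
fitted_train_width = len(model.transform(train)[0])
fitted_test_width = len(model.transform(test)[0])
```

In this sketch the stateless encoder produces vectors of different widths for the two datasets, so a downstream model trained on one cannot consume the other, while the fitted model produces the same width for both.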
Issue Links
- Is contained by
  - SPARK-8418 Add single- and multi-value support to ML Transformers (Resolved)
- is related to
  - SPARK-23008 OnehotEncoderEstimator python API (Resolved)
  - SPARK-21926 Compatibility between ML Transformers and Structured Streaming (Resolved)
- links to