Description
Add an option to allow unknown categories, probably via a parameter such as "allowUnknownCategories".
If true, handle categories that were not seen during fitting by assigning them to an extra category index during transform.
The API should resemble the API used for StringIndexer.
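To make the proposed behaviour concrete, here is a minimal pure-Python sketch of the indexing logic. It is not Spark code: the function names `fit_category_index` and `transform_categories` and the `allow_unknown` flag are hypothetical, chosen only to mirror the "allowUnknownCategories" idea above. For comparison, StringIndexer exposes this choice through its `handleInvalid` parameter ("error", "skip", "keep"), where "keep" places unseen labels in an additional index.

```python
from typing import Dict, Hashable, Iterable, List


def fit_category_index(values: Iterable[Hashable]) -> Dict[Hashable, int]:
    """Map each distinct category seen during fitting to an index (sorted order)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


def transform_categories(
    values: Iterable[Hashable],
    index: Dict[Hashable, int],
    allow_unknown: bool = False,  # hypothetical flag, stands in for "allowUnknownCategories"
) -> List[int]:
    """Replace each category by its index.

    Unseen categories go to one extra index (len(index)) when allow_unknown
    is True; otherwise a KeyError is raised.
    """
    unknown_index = len(index)
    out = []
    for v in values:
        if v in index:
            out.append(index[v])
        elif allow_unknown:
            out.append(unknown_index)
        else:
            raise KeyError(f"unseen category: {v!r}")
    return out


index = fit_category_index([2.0, 0.0, 5.0, 0.0])
indexed = transform_categories([5.0, 7.0, 0.0], index, allow_unknown=True)
```

With the flag off, the same call fails on the unseen value 7.0, which corresponds to the behaviour this issue proposes to make optional.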
Issue Links
- contains: SPARK-13846 VectorIndexer output on unknown feature should be more descriptive (Resolved)
- is duplicated by: SPARK-12367 NoSuchElementException during prediction with Random Forest Regressor (Closed)
- is related to: SPARK-22521 VectorIndexerModel support handle unseen categories via handleInvalid: Python API (Resolved)
- links to