Description
It would be handy to add a binary toggle Param to CountVectorizer, matching the binary option of scikit-learn's CountVectorizer: http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html
If set, all non-zero counts are set to 1.
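To make the requested behavior concrete, here is a minimal pure-Python sketch of the semantics. It uses neither Spark nor scikit-learn; the helper name count_vectorize and its arguments are hypothetical and chosen only for illustration.

```python
from collections import Counter


def count_vectorize(tokens, vocabulary, binary=False):
    """Hypothetical helper: map a token list to a count vector over `vocabulary`."""
    counts = Counter(tokens)
    # One entry per vocabulary term; terms absent from the document count as 0.
    vector = [counts[term] for term in vocabulary]
    if binary:
        # Binary toggle: every non-zero count becomes 1, zeros stay 0.
        vector = [1 if count > 0 else 0 for count in vector]
    return vector


vocab = ["a", "b", "c"]
doc = ["a", "b", "a", "a"]
print(count_vectorize(doc, vocab))               # [3, 1, 0]
print(count_vectorize(doc, vocab, binary=True))  # [1, 1, 0]
```

With the toggle on, the vector records only whether each term occurs, which suits models that expect presence/absence features rather than term frequencies.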
Issue Links
- relates to
  - SPARK-13963 Add binary toggle Param to ml.HashingTF (Resolved)
  - SPARK-13967 Add binary toggle Param to PySpark CountVectorizer (Resolved)
  - SPARK-14392 CountVectorizer Estimator should include binary toggle Param (Resolved)