Details
Type: New Feature
Status: In Progress
Priority: Major
Resolution: Unresolved
None
None
None
Docs Required, Release Notes Required
Description
Main goal:
Reduce the gap between Apache Spark and Apache Ignite in preprocessing operations. Closing this gap would make it easier to load Spark ML Pipelines into Ignite ML.
Next steps:
- Add Frequency Encoder
- Add additional Imputing Strategies (MIN, MAX, COUNT, MOST_FREQUENT, LEAST_FREQUENT)
- Add RobustScaler (will be added in Spark 3.0)
- Add CountVectorizer
- Add FeatureHasher
- Add QuantileDiscretizer
- Add Locality Sensitive Hashing (LSH)
- Add LabelEncoder
- Add RevertStringIndexing
- Add multi-column preprocessor
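For illustration only, here is a minimal Python sketch of what the first two items could compute. It is not Ignite ML or Spark code. The function names `fit_frequency_encoder` and `impute` are hypothetical, and the semantics are assumptions: frequency encoding maps each category to its relative frequency in the training data, and COUNT fills a missing value with the number of observed values.

```python
from collections import Counter


def fit_frequency_encoder(values):
    """Map each category to its relative frequency in `values` (assumed semantics)."""
    counts = Counter(values)
    total = len(values)
    return {category: n / total for category, n in counts.items()}


def impute(column, strategy):
    """Replace None entries in `column` with a statistic of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    counts = Counter(observed)
    if strategy == "MIN":
        fill = min(observed)
    elif strategy == "MAX":
        fill = max(observed)
    elif strategy == "COUNT":
        # Assumption: COUNT fills with the number of non-missing values.
        fill = len(observed)
    elif strategy == "MOST_FREQUENT":
        fill = counts.most_common(1)[0][0]
    elif strategy == "LEAST_FREQUENT":
        fill = min(counts, key=counts.get)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in column]
```

A categorical column would be encoded by looking each value up in the dictionary returned by `fit_frequency_encoder`. How ties between equally frequent values are broken is left to insertion order here, which a real implementation may want to define explicitly.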
Attachments
1. [ML] Add Frequency Encoding | Resolved | Alexey Zinoviev
2. [ML] Add support of the additional Imputing Strategies | Resolved | Alexey Zinoviev