Ignite / IGNITE-12079

[ML][Umbrella] Add advanced preprocessing techniques


Details

    • Type: New Feature
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ml
    • Labels: None
    • Ignite Flags: Docs Required, Release Notes Required

    Description

      Main goal:

      Reduce the gap between Apache Spark and Apache Ignite in preprocessing operations. Closing this gap could make it easier to load Spark ML Pipelines into Ignite ML.

       

      Next steps:

      1. Add Frequency Encoder
      2. Add new Imputing Strategies (MIN, MAX, COUNT, MOST_FREQUENT, LEAST_FREQUENT)
      3. Add RobustScaler (will be added in Spark 3.0)
      4. Add CountVectorizer
      5. Add FeatureHasher
      6. Add QuantileDiscretizer
      7. Add Locality Sensitive Hashing (LSH)
      8. Add LabelEncoder
      9. Add RevertStringIndexing
      10. Add multi-column preprocessor
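
To make items 1 and 2 above more concrete, here is a minimal, language-neutral sketch in plain Python. It is not Ignite ML's API: the function names `frequency_encode` and `impute` are hypothetical, and the exact semantics are assumptions. In particular, COUNT is assumed to fill gaps with the number of observed values, and ties between equally frequent values are broken by picking the smallest one.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def frequency_encode(column: Sequence[Hashable]) -> List[float]:
    """Replace each category with its relative frequency in the column."""
    counts = Counter(column)
    total = len(column)
    return [counts[value] / total for value in column]


def impute(column: Sequence[Optional[float]], strategy: str) -> List[float]:
    """Fill None entries with one statistic computed from the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    counts = Counter(observed)
    if strategy == "MIN":
        fill = min(observed)
    elif strategy == "MAX":
        fill = max(observed)
    elif strategy == "COUNT":
        # Assumption: COUNT means the number of non-missing values.
        fill = float(len(observed))
    elif strategy == "MOST_FREQUENT":
        # Assumption: ties are broken by taking the smallest value.
        fill = min(counts, key=lambda v: (-counts[v], v))
    elif strategy == "LEAST_FREQUENT":
        # Assumption: ties are broken by taking the smallest value.
        fill = min(counts, key=lambda v: (counts[v], v))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in column]


if __name__ == "__main__":
    print(frequency_encode(["a", "b", "a", "c"]))
    print(impute([1.0, None, 3.0, 3.0], "MOST_FREQUENT"))
```

A real implementation would presumably compute these statistics in a distributed fit step over partitioned data and apply them in a separate transform step; the sketch folds both steps into one call for brevity.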


          People

            Assignee: zaleslaw Alexey Zinoviev
            Reporter: zaleslaw Alexey Zinoviev
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 40m