  1. Ignite
  2. IGNITE-11655

[ML]: OneHotEncoder returns more columns than expected

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7
    • Fix Version/s: 2.8
    • Component/s: ml
    • Labels: None
    • OneHotEncoder returns more columns than expected
    • Release Notes Required

    Description

      OneHotEncoder returns more columns than expected: a feature with two distinct values, which could be encoded using two columns, is encoded using three. The following example demonstrates the problem:

      Map<Integer, Object[]> training = new HashMap<>();
      
      training.put(0, new Object[]{42.0});
      training.put(1, new Object[]{43.0});
      training.put(2, new Object[]{42.0});
      
      EncoderTrainer<Integer, Object[]> trainer = new EncoderTrainer<Integer, Object[]>()
          .withEncoderType(EncoderType.ONE_HOT_ENCODER)
          .withEncodedFeature(0);
      
      IgniteBiFunction<Integer, Object[], Vector> processor = trainer.fit(training, 1, (k, v) -> v);
      Vector res = processor.apply(1, new Object[]{42.0});
      
      System.out.println(Arrays.toString(res.asArray()));
      
      >>> [0.0, 1.0, 0.0]
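
For comparison, here is a minimal plain-Java sketch of the expected behaviour: one output column per distinct value, so two distinct values produce a vector of length 2. This is not Ignite's implementation. The class `OneHotSketch` and its methods `fitCategories` and `encode` are hypothetical names used only for illustration, and the column order (order of first appearance) is an assumption that may differ from the order Ignite uses.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class OneHotSketch {
    /** Assigns each distinct value a column index, in order of first appearance (an assumption). */
    static Map<Object, Integer> fitCategories(Object[] values) {
        Map<Object, Integer> index = new LinkedHashMap<>();
        for (Object v : values)
            index.putIfAbsent(v, index.size());
        return index;
    }

    /** Encodes a value as a vector with exactly one column per distinct value seen during fitting. */
    static double[] encode(Map<Object, Integer> index, Object value) {
        Integer col = index.get(value);
        if (col == null)
            throw new IllegalArgumentException("Unknown category: " + value);
        double[] res = new double[index.size()];
        res[col] = 1.0;
        return res;
    }

    public static void main(String[] args) {
        // Same feature values as in the report: 42.0, 43.0, 42.0 (two distinct values).
        Map<Object, Integer> index = fitCategories(new Object[]{42.0, 43.0, 42.0});

        System.out.println(Arrays.toString(encode(index, 42.0))); // [1.0, 0.0]
        System.out.println(Arrays.toString(encode(index, 43.0))); // [0.0, 1.0]
    }
}
```

With the training data from the report, the resulting vectors have length 2 rather than 3.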
      

            People

              Assignee: Alexey Zinoviev (zaleslaw)
              Reporter: Anton Dmitriev (dmitrievanthony)
              Votes: 0
              Watchers: 1

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m