  1. Ignite
  2. IGNITE-13344

[ML] DummyVectorizer fails to extract label for coordinate with value "0.0" when backed by sparse vector

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.8.1
    • Fix Version/s: None
    • Component/s: ml
    • Labels: None
    • Ignite Flags: Docs Required, Release Notes Required

    Description

      Given: A labeled DummyVectorizer:

       

      new DummyVectorizer<String>()
       .exclude(excludeCoordinates.stream().map(coord -> vectorLength + coord).toArray(Integer[]::new))
       .labeled(labelCoord);
      

      When the label is extracted, the call hierarchy eventually ends up at org.apache.ignite.ml.dataset.feature.extractor.impl.DummyVectorizer#feature, which returns null (the result of val.getRaw) when val is a sparse vector whose element at the requested label coordinate is 0.0. The training job expects a non-null label, so it fails with a NullPointerException:

      org.apache.ignite.IgniteException: Remote job threw user exception (override or implement ComputeTask.result(..) method if you would like to have automatic failover for this exception): null
          at org.apache.ignite.compute.ComputeTaskAdapter.result(ComputeTaskAdapter.java:102) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.processors.task.GridTaskWorker$5.apply(GridTaskWorker.java:1062) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.processors.task.GridTaskWorker$5.apply(GridTaskWorker.java:1055) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7037) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.processors.task.GridTaskWorker.result(GridTaskWorker.java:1055) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.processors.task.GridTaskWorker.onResponse(GridTaskWorker.java:862) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.processors.task.GridTaskProcessor.processJobExecuteResponse(GridTaskProcessor.java:1146) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.processors.job.GridJobWorker.finishJob(GridJobWorker.java:961) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.processors.job.GridJobWorker.finishJob(GridJobWorker.java:809) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:659) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:519) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) ~[ignite-core-2.8.1.jar:2.8.1]
          at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[na:na]
          at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[na:na]
          at java.base/java.lang.Thread.run(Thread.java:832) ~[na:na]
      Caused by: org.apache.ignite.IgniteException: null
          at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1858) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:596) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7005) ~[ignite-core-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:590) ~[ignite-core-2.8.1.jar:2.8.1]
          ... 5 common frames omitted
      Caused by: java.lang.NullPointerException: null
          at org.apache.ignite.ml.dataset.impl.bootstrapping.BootstrappedDatasetBuilder.build(BootstrappedDatasetBuilder.java:91) ~[ignite-ml-2.8.1.jar:2.8.1]
          at org.apache.ignite.ml.dataset.impl.bootstrapping.BootstrappedDatasetBuilder.build(BootstrappedDatasetBuilder.java:41) ~[ignite-ml-2.8.1.jar:2.8.1]
          at org.apache.ignite.ml.dataset.impl.cache.util.ComputeUtils.lambda$getData$4(ComputeUtils.java:239) ~[ignite-ml-2.8.1.jar:2.8.1]
          at org.apache.ignite.ml.dataset.impl.cache.util.PartitionDataStorage.lambda$computeDataIfAbsent$1(PartitionDataStorage.java:56) ~[ignite-ml-2.8.1.jar:2.8.1]
          at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708) ~[na:na]
          at org.apache.ignite.ml.dataset.impl.cache.util.PartitionDataStorage.computeDataIfAbsent(PartitionDataStorage.java:56) ~[ignite-ml-2.8.1.jar:2.8.1]
          at org.apache.ignite.ml.dataset.impl.cache.util.ComputeUtils.getData(ComputeUtils.java:209) ~[ignite-ml-2.8.1.jar:2.8.1]
          at org.apache.ignite.ml.dataset.impl.cache.CacheBasedDataset.lambda$compute$c7cefc59$1(CacheBasedDataset.java:172) ~[ignite-ml-2.8.1.jar:2.8.1]
          at org.apache.ignite.ml.dataset.impl.cache.util.ComputeUtils$DeployableCallable.call(ComputeUtils.java:432) ~[ignite-ml-2.8.1.jar:2.8.1]
          at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1855) ~[ignite-core-2.8.1.jar:2.8.1]
          ... 8 common frames omitted
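
      The failure mode described above can be illustrated without Ignite on the classpath. The sketch below is not Ignite's code: MapBackedVector is a hypothetical stand-in for a sparse vector, and it assumes that zero entries are simply not stored, so a raw lookup of such an entry yields null. The labelOrZero helper is likewise hypothetical and only shows one possible null-safe fallback.

```java
import java.util.HashMap;
import java.util.Map;

final class SparseLabelSketch {
    /** Hypothetical map-backed sparse vector: only non-zero entries are stored. */
    static final class MapBackedVector {
        private final int size;
        private final Map<Integer, Double> nonZero = new HashMap<>();

        MapBackedVector(int size) {
            this.size = size;
        }

        /** Stores a value; zeros are dropped from the backing map (assumption). */
        void set(int idx, double val) {
            checkIndex(idx);
            if (val == 0.0)
                nonZero.remove(idx);
            else
                nonZero.put(idx, val);
        }

        /** Raw lookup: returns null for any coordinate that is not stored. */
        Double getRaw(int idx) {
            checkIndex(idx);
            return nonZero.get(idx);
        }

        /** Numeric lookup: treats a missing coordinate as 0.0. */
        double get(int idx) {
            checkIndex(idx);
            return nonZero.getOrDefault(idx, 0.0);
        }

        private void checkIndex(int idx) {
            if (idx < 0 || idx >= size)
                throw new IndexOutOfBoundsException("Index: " + idx + ", size: " + size);
        }
    }

    /** Hypothetical null-safe label extraction: a missing raw value means 0.0. */
    static double labelOrZero(MapBackedVector v, int labelCoord) {
        Double raw = v.getRaw(labelCoord);
        return raw == null ? 0.0 : raw;
    }

    public static void main(String[] args) {
        MapBackedVector v = new MapBackedVector(4);
        v.set(0, 1.5);
        v.set(3, 0.0); // label coordinate holds 0.0, so nothing is stored

        System.out.println(v.getRaw(3));        // prints "null"
        System.out.println(labelOrZero(v, 3));  // prints "0.0"
    }
}
```

      In this sketch a caller that dereferences the result of getRaw for the label coordinate would hit a NullPointerException, whereas the numeric accessor or the null-safe helper returns 0.0. Whether a comparable fallback is appropriate inside DummyVectorizer#feature is for the maintainers to decide.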

          People

            Assignee: zaleslaw Alexey Zinoviev
            Reporter: thilo.ginkel Thilo-Alexander Ginkel
            Votes: 0
            Watchers: 2
