SAMOA / SAMOA-44

NPE when running VHT on KDD cup data

Details

    • Type: Bug
    • Status: To Do
    • Priority: Major
    • Resolution: Unresolved
    • SAMOA-API
    • None

    Description

      From the mailing list:

      We were able to run the HoeffdingTree algorithm on the KDD Cup 99 data set (both kddcup_full.arff and kddcup_10_percent.arff). The VerticalHoeffdingTree classifier also works fine on kddcup_10_percent.arff. However, when we run the VerticalHoeffdingTree classifier on kddcup_full.arff, it fails with a NullPointerException (see the console output below).

      The command we use to run SAMOA Local:

      bin/samoa local target/SAMOA-Local-0.3.0-SNAPSHOT.jar "PrequentialEvaluation -i -1 -f 41920 -l (com.yahoo.labs.samoa.learners.classifiers.trees.VerticalHoeffdingTree -p 4) -s (com.yahoo.labs.samoa.moa.streams.ArffFileStream -f kddcup_full.arff)"

      The console output of samoa:

      bin/samoa
      Deploying to LOCAL
      Command line string = PrequentialEvaluation -i -1 -f 41920 -l (com.yahoo.labs.samoa.learners.classifiers.trees.VerticalHoeffdingTree -p 4) -s (com.yahoo.labs.samoa.moa.streams.ArffFileStream -f kddcup_full.arff)
      2015-09-01 22:22:16,160 [main] INFO com.yahoo.labs.samoa.LocalDoTask (LocalDoTask.java:79) - Successfully instantiating com.yahoo.labs.samoa.tasks.PrequentialEvaluation
      2015-09-01 22:22:17,741 [main] INFO com.yahoo.labs.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:86) - 1 seconds for 41920 instances
      2015-09-01 22:22:17,760 [main] INFO com.yahoo.labs.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:172) - evaluation instances = 41,920
      classified instances = 41,920
      classifications correct (percent) = 99.988
      Kappa Statistic (percent) = -0.002
      Kappa Temporal Statistic (percent) = 28.571
      Exception in thread "main" java.lang.NullPointerException
          at com.yahoo.labs.samoa.learners.classifiers.trees.ModelAggregatorProcessor.process(ModelAggregatorProcessor.java:145)
          at com.yahoo.labs.samoa.topology.impl.SimpleProcessingItem.processEvent(SimpleProcessingItem.java:84)
          at com.yahoo.labs.samoa.topology.impl.SimpleStream.put(SimpleStream.java:71)
          at com.yahoo.labs.samoa.topology.impl.SimpleStream.put(SimpleStream.java:60)
          at com.yahoo.labs.samoa.learners.classifiers.trees.FilterProcessor.process(FilterProcessor.java:95)
          at com.yahoo.labs.samoa.topology.impl.SimpleProcessingItem.processEvent(SimpleProcessingItem.java:84)
          at com.yahoo.labs.samoa.topology.impl.SimpleStream.put(SimpleStream.java:71)
          at com.yahoo.labs.samoa.topology.impl.SimpleStream.put(SimpleStream.java:60)
          at com.yahoo.labs.samoa.topology.LocalEntranceProcessingItem.injectNextEvent(LocalEntranceProcessingItem.java:46)
          at com.yahoo.labs.samoa.topology.LocalEntranceProcessingItem.startSendingEvents(LocalEntranceProcessingItem.java:66)
          at com.yahoo.labs.samoa.topology.impl.SimpleTopology.run(SimpleTopology.java:42)
          at com.yahoo.labs.samoa.topology.impl.SimpleEngine.submitTopology(SimpleEngine.java:33)
          at com.yahoo.labs.samoa.LocalDoTask.main(LocalDoTask.java:87)

      We were able to track down the problem to the first instance that causes it; the instance is on the 76426th line in kddcup_full.arff. The instance is as follows:

      1,tcp,smtp,SF,2252,331,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,7,0,0,0,0,1,0,1,5,216,1,0,0.2,0.01,0,0,0,0,normal

      We haven’t noticed any difference between the problematic instance and the other instances. Could you point us to the root cause and suggest how to overcome it?
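      One way to double-check that the instance is structurally ordinary is to compare its field count with that of a neighbouring line. The following is only a minimal sketch: the class name InstanceCheck and its helpers are hypothetical and not part of SAMOA, and it assumes plain comma-separated ARFF data lines with no quoted values containing commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of SAMOA: inspects a single line of an ARFF file.
public class InstanceCheck {

    // Counts comma-separated fields on a data line.
    // The -1 limit keeps trailing empty fields so a dangling comma is not hidden.
    // Assumption: no quoted values that themselves contain commas.
    static int countFields(String line) {
        return line.split(",", -1).length;
    }

    // Returns the line at the given 1-based position, or null if the file is shorter.
    static String readLine(String path, long lineNumber) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(path))) {
            String line;
            long current = 0;
            while ((line = reader.readLine()) != null) {
                if (++current == lineNumber) {
                    return line;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // File name and line number are taken from the report above.
        String path = args.length > 0 ? args[0] : "kddcup_full.arff";
        long suspectLine = args.length > 1 ? Long.parseLong(args[1]) : 76426L;

        String previous = readLine(path, suspectLine - 1);
        String suspect = readLine(path, suspectLine);
        if (previous == null || suspect == null) {
            System.out.println("File has fewer than " + suspectLine + " lines");
            return;
        }
        System.out.println("fields on previous line: " + countFields(previous));
        System.out.println("fields on suspect line:  " + countFields(suspect));
    }
}
```

      If both counts match, the instance is at least well-formed at the field level, which would suggest the cause lies in the state of the tree when the instance arrives rather than in the data line itself.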

      As a workaround, we’ve made the following addition to ModelAggregatorProcessor.java:

      if (leafNode == null) {
          return false; // skip this event when getNode() returned null
      }

      after the line

      ActiveLearningNode leafNode = (ActiveLearningNode) foundNode.getNode();

      With this change, the VerticalHoeffdingTree classifier also works fine on kddcup_full.arff. Is this an acceptable solution to the problem? What do you think?
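      The guard described above can be illustrated in isolation. This is a sketch under stated assumptions, not the actual SAMOA code: Node, ActiveLearningNode, InactiveLearningNode, and FoundNode below are simplified stand-ins that only borrow names from the report, and learnFromInstance() here is a hypothetical placeholder for the real training step. The sketch uses an instanceof check, which covers the null case from the workaround and would also avoid a ClassCastException if the found node were not an active leaf.

```java
// Simplified stand-ins for illustration only; these are NOT the SAMOA classes.
public class NullGuardSketch {

    static class Node {}

    static class ActiveLearningNode extends Node {
        int instancesSeen = 0;

        // Hypothetical placeholder for the real training step.
        void learnFromInstance() {
            instancesSeen++;
        }
    }

    // Hypothetical non-active leaf type, used to show the wrong-type case.
    static class InactiveLearningNode extends Node {}

    static class FoundNode {
        private final Node node;

        FoundNode(Node node) {
            this.node = node;
        }

        Node getNode() {
            return node;
        }
    }

    // Returns true if the instance was used for training, false if it was skipped.
    static boolean process(FoundNode foundNode) {
        Node node = foundNode.getNode();
        // instanceof is false for null, so this covers both a missing node
        // and a node that is not an active leaf.
        if (!(node instanceof ActiveLearningNode)) {
            return false;
        }
        ActiveLearningNode leafNode = (ActiveLearningNode) node;
        leafNode.learnFromInstance();
        return true;
    }

    public static void main(String[] args) {
        System.out.println(process(new FoundNode(new ActiveLearningNode())));
        System.out.println(process(new FoundNode(null)));
        System.out.println(process(new FoundNode(new InactiveLearningNode())));
    }
}
```

      A guard like this keeps the run alive, but it drops the affected instance silently, so it does not explain why no leaf node was found in the first place. Logging the skipped instance may help with tracking down the root cause.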

          People

            Assignee: Unassigned
            Reporter: Gianmarco De Francisci Morales (azaroth)