Lucene - Core / LUCENE-6435

java.util.ConcurrentModificationException: Removal from the cache failed error in SimpleNaiveBayesClassifier

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.1
    • Fix Version/s: 6.0
    • Component/s: modules/classification
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      While using SimpleNaiveBayesClassifier on a very large index (all Italian Wikipedia articles), I see the following code trigger a ConcurrentModificationException when a Query is evicted from the LRUQueryCache.

      BooleanQuery booleanQuery = new BooleanQuery();
      // match the word in any of the text fields
      BooleanQuery subQuery = new BooleanQuery();
      for (String textFieldName : textFieldNames) {
        subQuery.add(new BooleanClause(new TermQuery(new Term(textFieldName, word)), BooleanClause.Occur.SHOULD));
      }
      booleanQuery.add(new BooleanClause(subQuery, BooleanClause.Occur.MUST));
      // restrict the match to documents of class c
      booleanQuery.add(new BooleanClause(new TermQuery(new Term(classFieldName, c)), BooleanClause.Occur.MUST));
      //...
      // count the matching documents without scoring them
      TotalHitCountCollector totalHitCountCollector = new TotalHitCountCollector();
      indexSearcher.search(booleanQuery, totalHitCountCollector);
      return totalHitCountCollector.getTotalHits();
      

      This is the complete stack trace:

      java.util.ConcurrentModificationException: Removal from the cache failed! This is probably due to a query which has been modified after having been put into  the cache or a badly implemented clone(). Query class: [class org.apache.lucene.search.BooleanQuery], query: [#text:panoram #cat:1356]
      	at __randomizedtesting.SeedInfo.seed([B6513DEC3681FEF5:138235BE33532634]:0)
      	at org.apache.lucene.search.LRUQueryCache.evictIfNecessary(LRUQueryCache.java:285)
      	at org.apache.lucene.search.LRUQueryCache.putIfAbsent(LRUQueryCache.java:268)
      	at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:569)
      	at org.apache.lucene.search.ConstantScoreWeight.scorer(ConstantScoreWeight.java:82)
      	at org.apache.lucene.search.Weight.bulkScorer(Weight.java:137)
      	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:560)
      	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:367)
      	at org.apache.lucene.classification.SimpleNaiveBayesClassifier.getWordFreqForClass(SimpleNaiveBayesClassifier.java:288)
      	at org.apache.lucene.classification.SimpleNaiveBayesClassifier.calculateLogLikelihood(SimpleNaiveBayesClassifier.java:248)
      	at org.apache.lucene.classification.SimpleNaiveBayesClassifier.assignClassNormalizedList(SimpleNaiveBayesClassifier.java:169)
      	at org.apache.lucene.classification.SimpleNaiveBayesClassifier.assignClass(SimpleNaiveBayesClassifier.java:125)
      	at org.apache.lucene.classification.WikipediaTest.testItalianWikipedia(WikipediaTest.java:126)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
      	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
      	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
      	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
      	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
      	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
      	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
      	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
      	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
      	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
      	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
      	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
      	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
      	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
      	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
      	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
      	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
      	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
      	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
      	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
      	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
      	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
      	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
      	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
      	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
      	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
      	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
      	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
      	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
      	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
      	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
      	at java.lang.Thread.run(Thread.java:745)
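
      The exception message blames a query that was modified after being put into the cache. A minimal sketch of that general failure mode follows, using a plain java.util.HashMap rather than Lucene's LRUQueryCache. MutableQuery, its term and boost fields, and removalSucceeds are hypothetical names made up for this illustration; the sketch does not claim to reproduce the actual cause of this issue.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class MutableKeyDemo {

    /** Hypothetical stand-in for a mutable query (not a Lucene class). */
    static final class MutableQuery {
        private final String term;
        private int boost = 1; // mutable state that takes part in equals/hashCode

        MutableQuery(String term) {
            this.term = term;
        }

        void setBoost(int boost) {
            this.boost = boost;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof MutableQuery)) {
                return false;
            }
            MutableQuery that = (MutableQuery) other;
            return term.equals(that.term) && boost == that.boost;
        }

        @Override
        public int hashCode() {
            return Objects.hash(term, boost);
        }
    }

    /**
     * Puts a query into a hash-based cache, optionally mutates it,
     * then tries to remove it. Returns true if the entry was found and removed.
     */
    public static boolean removalSucceeds(boolean mutateAfterPut) {
        Map<MutableQuery, String> cache = new HashMap<>();
        MutableQuery query = new MutableQuery("text:panoram");
        cache.put(query, "cached result");

        if (mutateAfterPut) {
            // The hash code changes, but the map still stores the entry under the old hash.
            query.setBoost(2);
        }
        return cache.remove(query) != null;
    }

    public static void main(String[] args) {
        System.out.println("unmodified key removed: " + removalSucceeds(false)); // true
        System.out.println("modified key removed:   " + removalSucceeds(true));  // false
    }
}
```

      In this sketch the entry is still in the map after the failed removal; it simply cannot be found through the mutated key. A cache that checks the result of such a removal has reason to report an inconsistency, which is the kind of situation the message above describes.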
      

      Strangely, the exception does not occur if I change the last lines of the code above so that they do not use the TotalHitCountCollector:

      return indexSearcher.search(booleanQuery, 1).totalHits;
      

        Attachments

        1. patch.rtf
          2 kB
          Chang KaiShin
        2. LUCENE-6435.patch
          1.0 kB
          Adrien Grand


            People

            • Assignee:
              Tommaso Teofili (teofili)
            • Reporter:
              Tommaso Teofili (teofili)
            • Votes:
              0
            • Watchers:
              4
