Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
2.4.0SDK
None
None
Description
I am developing a series of Lucene tokenizers which can use UIMA for creating tokens via extracted annotations.
While doing a stress test with lots of different strings I experienced the following:
[junit] Testsuite: org.apache.lucene.analysis.uima.UIMATypeAwareAnalyzerTest
[junit] Tests run: 2, Failures: 0, Errors: 1, Time elapsed: 92,061 sec
[junit]
[junit] ------------- Standard Error -----------------
[junit] The following exceptions were thrown by threads:
[junit] *** Thread: Thread-9 ***
[junit] java.lang.RuntimeException: java.io.IOException: org.apache.uima.analysis_engine.AnalysisEngineProcessException
[junit]     at org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:289)
[junit] Caused by: java.io.IOException: org.apache.uima.analysis_engine.AnalysisEngineProcessException
[junit]     at org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer.incrementToken(UIMATypeAwareAnnotationsTokenizer.java:87)
[junit]     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:121)
[junit]     at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:371)
[junit]     at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:295)
[junit]     at org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:287)
[junit] Caused by: org.apache.uima.analysis_engine.AnalysisEngineProcessException
[junit]     at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:701)
[junit]     at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
[junit]     at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
[junit]     at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
[junit]     at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
[junit]     at org.apache.lucene.analysis.uima.BaseUIMATokenizer.analyzeInput(BaseUIMATokenizer.java:57)
[junit]     at org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer.analyzeText(UIMATypeAwareAnnotationsTokenizer.java:73)
[junit]     at org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer.incrementToken(UIMATypeAwareAnnotationsTokenizer.java:85)
[junit]     ... 4 more
[junit] Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 2
[junit]     at java.util.ArrayList.RangeCheck(ArrayList.java:547)
[junit]     at java.util.ArrayList.get(ArrayList.java:322)
[junit]     at org.apache.uima.flow.impl.FixedFlowController$FixedFlowObject.next(FixedFlowController.java:216)
[junit]     at org.apache.uima.analysis_engine.asb.impl.FlowContainer.next(FlowContainer.java:98)
[junit]     at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:667)
[junit]     ... 11 more
I'm debugging it and will see if I can pin down the exact bug (and a fix).