Description
TestKoreanTokenizer#testRandomHugeStrings failed in CI with the following exception:
[junit4] > Throwable #1: java.lang.AssertionError
[junit4] > at __randomizedtesting.SeedInfo.seed([8C5E2BE10F581CB:90E6857D4E833D83]:0)
[junit4] > at org.apache.lucene.analysis.ko.KoreanTokenizer.add(KoreanTokenizer.java:334)
[junit4] > at org.apache.lucene.analysis.ko.KoreanTokenizer.parse(KoreanTokenizer.java:707)
[junit4] > at org.apache.lucene.analysis.ko.KoreanTokenizer.incrementToken(KoreanTokenizer.java:377)
[junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:748)
[junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:659)
[junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:561)
[junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:474)
[junit4] > at org.apache.lucene.analysis.ko.TestKoreanTokenizer.testRandomHugeStrings(TestKoreanTokenizer.java:313)
[junit4] > at java.lang.Thread.run(Thread.java:748)
[junit4] 2> NOTE: leaving temporary files
I am able to reproduce locally with:
ant test -Dtestcase=TestKoreanTokenizer -Dtests.method=testRandomHugeStrings -Dtests.seed=8C5E2BE10F581CB -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-7.7/test-data/enwiki.random.lines.txt -Dtests.locale=uk-UA -Dtests.timezone=Europe/Istanbul -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
After some investigation, I found that the position of the buffer is not updated when the maximum backtrace size (1024) is reached.
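To illustrate the kind of failure described above, here is a minimal toy model. It is not the KoreanTokenizer code: the class, the method, the wordLength parameter and the invariant check are all hypothetical, and only the 1024 limit comes from this report. It assumes that a forced backtrace emits a path ending slightly past the current position and discards the state before that point, so a parser that does not move its position afterwards tries to add an entry at an offset that no longer exists.

```java
// Toy model only; names and logic are assumptions, not Lucene's implementation.
class BacktraceGapSketch {
    // Limit taken from the report: a backtrace is forced once this many
    // positions have been buffered since the previous backtrace.
    static final int MAX_BACKTRACE_GAP = 1024;

    /**
     * Walks inputLength positions and returns how many forced backtraces occurred.
     * wordLength (hypothetical) is how far past pos the emitted path ends.
     * Throws IllegalStateException if an entry would be added before the
     * start of the retained window.
     */
    static int parse(int inputLength, int wordLength, boolean advanceAfterForcedBacktrace) {
        int pos = 0;               // current parse position
        int lastBackTracePos = 0;  // start of the retained window
        int forced = 0;

        while (pos < inputLength) {
            if (pos - lastBackTracePos >= MAX_BACKTRACE_GAP) {
                // Forced backtrace: emit the best path so far and drop older state.
                int pathEnd = Math.min(inputLength, pos + wordLength);
                lastBackTracePos = pathEnd;
                forced++;
                if (advanceAfterForcedBacktrace) {
                    // Resume where the emitted path ended.
                    pos = pathEnd;
                    continue;
                }
                // Otherwise pos is left stale, behind the retained window.
            }
            // Adding an entry at pos requires pos to be inside the retained window.
            if (pos < lastBackTracePos) {
                throw new IllegalStateException(
                    "entry at " + pos + " precedes window start " + lastBackTracePos);
            }
            pos++;
        }
        return forced;
    }

    public static void main(String[] args) {
        System.out.println("forced backtraces: " + parse(5000, 3, true));
        try {
            parse(5000, 3, false);
        } catch (IllegalStateException e) {
            System.out.println("stale position: " + e.getMessage());
        }
    }
}
```

In this sketch, inputs shorter than 1024 positions never trigger the forced backtrace, so both modes behave the same. That would be consistent with the failure showing up only in the huge-string test, though the actual mechanism in KoreanTokenizer may differ.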