Details
Bug
Status: Resolved
Major
Resolution: Fixed
8.8
None
None
New
Description
If assertions are enabled, gaps at the beginning or end of an alternate path can result in assertion errors, for example:
java.lang.AssertionError: 2
at org.apache.lucene.analysis.core.FlattenGraphFilter.releaseBufferedToken(FlattenGraphFilter.java:195)
or
java.lang.AssertionError
at org.apache.lucene.analysis.core.FlattenGraphFilter.releaseBufferedToken(FlattenGraphFilter.java:191)
If assertions are not enabled, the same conditions result in either an IndexOutOfBounds exception or dropped tokens, for example:
java.lang.ArrayIndexOutOfBoundsException: Index -2 out of bounds for length 8
at org.apache.lucene.util.RollingBuffer.get(RollingBuffer.java:109)
at org.apache.lucene.analysis.core.FlattenGraphFilter.incrementToken(FlattenGraphFilter.java:325)
These issues can be reproduced with the following unit tests:
public void testAltPathFirstStepHole() throws IOException {
  TokenStream in = new CannedTokenStream(0, 3, new Token[]{
      token("abc", 1, 3, 0, 3),
      token("b", 1, 1, 1, 2),
      token("c", 1, 1, 2, 3)
  });
  TokenStream out = new FlattenGraphFilter(in);
  assertTokenStreamContents(out,
      new String[]{"abc", "b", "c"},
      new int[] {0, 1, 2},
      new int[] {3, 2, 3},
      new int[] {1, 1, 1},
      new int[] {3, 1, 1}, // token 0 may need to be len 1 after flattening
      3);
}

public void testAltPathLastStepHole() throws IOException {
  TokenStream in = new CannedTokenStream(0, 4, new Token[]{
      token("abc", 1, 3, 0, 3),
      token("a", 0, 1, 0, 1),
      token("b", 1, 1, 1, 2),
      token("d", 2, 1, 3, 4)
  });
  TokenStream out = new FlattenGraphFilter(in);
  assertTokenStreamContents(out,
      new String[]{"abc", "a", "b", "d"},
      new int[] {0, 0, 1, 3},
      new int[] {1, 1, 2, 4},
      new int[] {1, 0, 1, 2},
      new int[] {3, 1, 1, 1},
      4);
}

public void testAltPathLastStepHoleWithoutEndToken() throws IOException {
  TokenStream in = new CannedTokenStream(0, 2, new Token[]{
      token("abc", 1, 3, 0, 3),
      token("a", 0, 1, 0, 1),
      token("b", 1, 1, 1, 2)
  });
  TokenStream out = new FlattenGraphFilter(in);
  assertTokenStreamContents(out,
      new String[]{"abc", "a", "b"},
      new int[] {0, 0, 1},
      new int[] {1, 1, 2},
      new int[] {1, 0, 1},
      new int[] {1, 1, 1},
      2);
}
I believe LUCENE-8723 is a related issue, as it looks like the last token in an alternate path is being deleted there as well.
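To make the "gap" in each test input concrete, here is a minimal plain-Java sketch with no Lucene dependency. It turns each token's position increment and position length into a graph arc (from node, to node), then lists nodes that a token leaves but none arrives at, and nodes that a token arrives at but none leaves. The class and method names (TokenGraphGaps, toArcs, missingIncoming, missingOutgoing) are hypothetical, and reading a "gap" as a node with a missing incoming or outgoing arc is my interpretation of the tests. It does not reproduce FlattenGraphFilter's internal logic.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of Lucene.
class TokenGraphGaps {

    /** Converts per-token posInc/posLength values into {fromNode, toNode} arcs. */
    static int[][] toArcs(int[] posIncs, int[] posLengths) {
        int[][] arcs = new int[posIncs.length][];
        int pos = -1; // position before the first token; posInc 1 puts it at 0
        for (int i = 0; i < posIncs.length; i++) {
            pos += posIncs[i];
            arcs[i] = new int[] {pos, pos + posLengths[i]};
        }
        return arcs;
    }

    /** Nodes that some token leaves but no token arrives at, excluding the start node. */
    static Set<Integer> missingIncoming(int[][] arcs) {
        Set<Integer> result = new TreeSet<>();
        int start = Integer.MAX_VALUE;
        for (int[] arc : arcs) {
            result.add(arc[0]);
            start = Math.min(start, arc[0]);
        }
        for (int[] arc : arcs) {
            result.remove(arc[1]);
        }
        result.remove(start);
        return result;
    }

    /** Nodes that some token arrives at but no token leaves, excluding the last node. */
    static Set<Integer> missingOutgoing(int[][] arcs) {
        Set<Integer> result = new TreeSet<>();
        int end = Integer.MIN_VALUE;
        for (int[] arc : arcs) {
            result.add(arc[1]);
            end = Math.max(end, arc[1]);
        }
        for (int[] arc : arcs) {
            result.remove(arc[0]);
        }
        result.remove(end);
        return result;
    }

    public static void main(String[] args) {
        // posInc / posLength values copied from testAltPathFirstStepHole
        int[][] first = toArcs(new int[] {1, 1, 1}, new int[] {3, 1, 1});
        System.out.println(missingIncoming(first) + " " + missingOutgoing(first)); // [1] []

        // posInc / posLength values copied from testAltPathLastStepHole
        int[][] last = toArcs(new int[] {1, 0, 1, 2}, new int[] {3, 1, 1, 1});
        System.out.println(missingIncoming(last) + " " + missingOutgoing(last)); // [] [2]
    }
}
```

Under this reading, the first test has nothing arriving at node 1 (the alternate path "b", "c" has no first step), and the second and third tests have nothing leaving node 2 (the alternate path "a", "b" never reaches the end of "abc").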