Details
Bug
Status: Open
Minor
Resolution: Unresolved
3.5
None
None
New
Description
The cause is that the input field in class SentenceTokenizer isn't reset when the method reset() is called.
There are two input fields: one belongs to Tokenizer and the other to TokenFilter. To reset a TokenStream created by SmartChineseAnalyzer, both of them need to be reset. The bug arises because the author forgot to reset the input field in class SentenceTokenizer.
Class path: org.apache.lucene.analysis.cn.smart.SentenceTokenizer
Original code:
public final class SentenceTokenizer extends Tokenizer {
....
@Override
public void reset() throws IOException
...
}
This method should be changed as follows:
public void reset() throws IOException {
  super.reset();
  /* should reset input */
  if (input.markSupported())
    input.reset();
  tokenStart = tokenEnd = 0;
}
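
To illustrate why rewinding the underlying reader matters, here is a minimal standalone sketch that uses only java.io and no Lucene classes. The class name ReaderResetSketch and the helpers drain, rewindIfSupported, and threePasses are hypothetical names made up for this example. It assumes a StringReader as the input, whose reset() rewinds to the start of the string when no mark has been set; other Reader implementations may behave differently, for example by throwing an IOException if no mark exists.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReaderResetSketch {

    /** Reads every remaining character from the reader into a string. */
    static String drain(Reader input) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = input.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    /** Same guard as in the proposed fix: rewind only if mark/reset is supported. */
    static void rewindIfSupported(Reader input) throws IOException {
        if (input.markSupported()) {
            input.reset();
        }
    }

    /**
     * Consumes the text three times: a first pass, a second pass without
     * rewinding, and a third pass after rewinding the reader.
     */
    static String[] threePasses(String text) {
        try {
            Reader input = new StringReader(text); // assumption: mark-capable reader
            String firstPass = drain(input);
            String withoutRewind = drain(input);   // reader is already exhausted
            rewindIfSupported(input);
            String afterRewind = drain(input);
            return new String[] { firstPass, withoutRewind, afterRewind };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] passes = threePasses("First sentence. Second sentence.");
        System.out.println("first pass:     [" + passes[0] + "]");
        System.out.println("without rewind: [" + passes[1] + "]");
        System.out.println("after rewind:   [" + passes[2] + "]");
    }
}
```

In this sketch the second pass yields nothing because the reader is exhausted, which is analogous to a tokenizer producing no tokens after a reset() that leaves its input untouched. The markSupported() guard keeps the call safe for readers that cannot be rewound at all.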