Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Fix Version: 1.4
- Labels: None
Description
Because of backwards-compatibility problems (LUCENE-1906), the CharStream/CharFilter API was changed slightly. Tokenizer now has only an input field of type java.io.Reader (as before the CharStream code was introduced). To correct offsets, you must now call the Tokenizer.correctOffset(int) method, which delegates to the CharStream if the input is a subclass of CharStream, and otherwise returns the offset unchanged. In most Tokenizers it is enough to replace all occurrences of input.correctOffset() with this.correctOffset(). Custom Tokenizers in Solr should also be checked to verify that they correct their offsets.
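To illustrate the delegation described above, here is a minimal, self-contained sketch of the pattern. The classes below (CharStream, ShiftedCharStream, Tokenizer) are simplified stand-ins, not the actual Lucene classes: the real Tokenizer base class lives in org.apache.lucene.analysis and has a richer API. The sketch only models the part the description talks about, namely that correctOffset(int) delegates when the input happens to be a CharStream and falls back to the uncorrected offset otherwise.

```java
import java.io.FilterReader;
import java.io.Reader;
import java.io.StringReader;

class Demo {
    // Hypothetical stand-in for Lucene's CharStream: a Reader that can
    // map an offset in the filtered output back to the original input.
    static abstract class CharStream extends FilterReader {
        protected CharStream(Reader in) { super(in); }
        public abstract int correctOffset(int currentOff);
    }

    // A CharStream that shifts every offset by a fixed amount, as a
    // CharFilter that removed characters in front of the token might.
    static class ShiftedCharStream extends CharStream {
        private final int shift;
        ShiftedCharStream(Reader in, int shift) { super(in); this.shift = shift; }
        @Override
        public int correctOffset(int currentOff) { return currentOff + shift; }
    }

    // Mirrors the described Tokenizer behavior: the input field is a plain
    // java.io.Reader, and correctOffset delegates only when the input
    // actually is a CharStream; otherwise the offset is returned as-is.
    static class Tokenizer {
        protected final Reader input;
        Tokenizer(Reader input) { this.input = input; }
        protected final int correctOffset(int currentOff) {
            return (input instanceof CharStream)
                ? ((CharStream) input).correctOffset(currentOff)
                : currentOff; // plain Reader: nothing to correct
        }
    }
}
```

With this model, a Tokenizer subclass simply calls this.correctOffset(off) for its start and end offsets, and the call is a no-op unless a CharFilter chain (a CharStream) was wrapped around the reader, which is exactly the "change input.correctOffset() to this.correctOffset()" migration the description recommends.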
Attachments
Issue Links
- relates to
  - LUCENE-1906 Backwards problems with CharStream and Tokenizers with custom reset(Reader) method (Closed)
  - SOLR-1404 Random failures with highlighting (Closed)