Details
- Improvement
- Status: Open
- Minor
- Resolution: Unresolved
- New, Patch Available
Description
The CharFilter API lets you wrap a Reader, altering the contents before the Tokenizer sees them.
It also allows you to correct the offsets so this is transparent to highlighting.
One problem is that the API isn't reusable: if you have a lot of short documents, it's going to be inefficient.
Additionally, there is some unnecessary wrapping in Tokenizer: the constructor calls CharReader.get on its input, but reset(Reader) does not.
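To make the wrapping and offset-correction idea concrete, here is a minimal sketch in plain Java. It does not use the Lucene classes: DroppingReader, its correctOffset method, and its reset(Reader) hook are hypothetical names chosen for illustration, and the reset hook is only one possible shape for reuse, not what any patch on this issue does.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a char filter: drops one character and records how offsets shift. */
class DroppingReader extends FilterReader {
    private final char dropped;
    // Output offsets at which an input character was removed, in increasing order.
    private final List<Integer> cuts = new ArrayList<>();
    private int outputPos = 0;

    DroppingReader(Reader in, char dropped) {
        super(in);
        this.dropped = dropped;
    }

    @Override
    public int read() throws IOException {
        int c;
        while ((c = in.read()) == dropped) {
            cuts.add(outputPos); // one input char vanished before this output position
        }
        if (c != -1) {
            outputPos++;
        }
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = 0;
        while (n < len) {
            int c = read();
            if (c == -1) {
                break;
            }
            buf[off + n++] = (char) c;
        }
        return (n == 0 && len > 0) ? -1 : n;
    }

    /** Maps an offset in the filtered text back to an offset in the original text. */
    int correctOffset(int outputOffset) {
        int shift = 0;
        for (int cut : cuts) {
            if (cut <= outputOffset) {
                shift++;
            }
        }
        return outputOffset + shift;
    }

    /** Hypothetical reuse hook: point the same instance at the next document. */
    void reset(Reader next) {
        this.in = next; // FilterReader.in is protected and non-final
        cuts.clear();
        outputPos = 0;
    }
}
```

With the input "a-b-c" and '-' dropped, the consumer sees "abc", and correctOffset maps the position of 'b' in the filtered text (1) back to its position in the original (2). Calling reset with a new StringReader lets the same instance process the next short document without allocating a new wrapper.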
Attachments
Issue Links
- is related to LUCENE-4228 Refactor CharFilter to be a java.io.FilterReader (Closed)