- Type: Improvement
- Status: Open
- Priority: Minor
- Resolution: Unresolved
- Affects Version/s: None
- Fix Version/s: None
- Component/s: modules/analysis
- Labels: None
- Lucene Fields: New, Patch Available

The CharFilter API lets you wrap a Reader, altering the contents before the Tokenizer sees them.
It also allows you to correct the offsets so this is transparent to highlighting.
One problem is that the API isn't reusable: if you have a lot of short documents, it's going to be inefficient.
Additionally, there is some unnecessary wrapping in Tokenizer (the constructor calls CharReader.get, but reset(Reader) does not).
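
To make the wrap-and-correct idea concrete, here is a minimal sketch built only on java.io.FilterReader. It is not Lucene's CharFilter: the class name SkippingReader, its constructor, and its correctOffset method are hypothetical names chosen for illustration. It drops one character from the stream and remembers where each drop happened, so an offset in the filtered text can be mapped back to the original text.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical illustration (not a Lucene class): removes every occurrence of
 * one character and records enough state to map filtered offsets back to
 * offsets in the original input.
 */
class SkippingReader extends FilterReader {
    private final char skip;
    // For each dropped character, the filtered-text position at which it was dropped.
    private final List<Integer> dropPoints = new ArrayList<>();
    // Number of characters handed to the consumer so far.
    private int outputPos = 0;

    SkippingReader(Reader in, char skip) {
        super(in);
        this.skip = skip;
    }

    @Override
    public int read() throws IOException {
        int c;
        // Swallow skipped characters, noting where each one was removed.
        while ((c = in.read()) != -1 && c == skip) {
            dropPoints.add(outputPos);
        }
        if (c != -1) {
            outputPos++;
        }
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            int c = read();
            if (c == -1) {
                break;
            }
            buf[off + n++] = (char) c;
        }
        return n == 0 ? -1 : n;
    }

    /** Maps an offset in the filtered text to the matching offset in the original text. */
    int correctOffset(int filteredOffset) {
        int shift = 0;
        for (int p : dropPoints) {
            if (p > filteredOffset) {
                break; // dropPoints is in ascending order
            }
            shift++;
        }
        return filteredOffset + shift;
    }
}
```

Reading "a-b-c" through new SkippingReader(new StringReader("a-b-c"), '-') yields "abc", and correctOffset(1) returns 2, the position of 'b' in the original input. Note that the recorded drop points belong to one input stream, so in this sketch a new wrapper has to be allocated per document, which is the kind of per-document cost the issue is concerned with.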
- is related to: LUCENE-4228 Refactor CharFilter to be a java.io.FilterReader (Closed)