Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Lucene Fields: New
Description
When ShingleFilter hits a hole, it uses _ as the token. For example, the bigrams for "the dog barked", if you have a StopFilter removing "the", would be: "_ dog", "dog barked".
But if the input ends with a stopword, e.g. "wizard of", ShingleFilter fails to produce "wizard _" due to LUCENE-3849. Once we fix that, I think we should fix ShingleFilter to make shingles for trailing holes too.
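The behavior described above can be illustrated with a small standalone model. This is only a sketch and not Lucene's ShingleFilter code: the names `expand_holes`, `bigrams`, and `trailing_holes` are hypothetical, each token is assumed to be a (term, position increment) pair, and the trailing hole count stands in for the final position increment that LUCENE-3849 proposes TokenStream.end() should report. Skipping shingles made up only of fillers is also an assumption of this sketch.

```python
FILLER = "_"  # placeholder used wherever a token was removed (a "hole")


def expand_holes(tokens, trailing_holes=0):
    """Flatten (term, position_increment) pairs into one slot per position.

    A position increment of 2 means one token was removed before this term,
    so one filler is emitted ahead of it. `trailing_holes` is a hypothetical
    parameter modelling holes left after the last real token.
    """
    slots = []
    for term, pos_inc in tokens:
        slots.extend([FILLER] * (pos_inc - 1))
        slots.append(term)
    slots.extend([FILLER] * trailing_holes)
    return slots


def bigrams(tokens, trailing_holes=0):
    """Build bigram shingles over the expanded slots.

    Assumption of this sketch: a shingle consisting only of fillers is skipped.
    """
    slots = expand_holes(tokens, trailing_holes)
    return [
        f"{a} {b}"
        for a, b in zip(slots, slots[1:])
        if not (a == FILLER and b == FILLER)
    ]


# "the dog barked" with "the" removed: "dog" arrives with position increment 2.
print(bigrams([("dog", 2), ("barked", 1)]))

# "wizard of" with "of" removed, trailing hole ignored (the reported problem).
print(bigrams([("wizard", 1)], trailing_holes=0))

# Same input once the trailing hole is visible to the shingle step.
print(bigrams([("wizard", 1)], trailing_holes=1))
```

In this model, the only difference between the failing and the desired case is whether the shingle step learns about the hole after the last token, which is why the change depends on LUCENE-3849.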
Issue Links
- is blocked by LUCENE-3849 position increments should be implemented by TokenStream.end() (Closed)