[LUCENE-8344] TokenStreamToAutomaton doesn't ignore trailing posInc when preservePositionIncrements=false - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 7.4
Component/s: modules/suggest
Labels: None
Lucene Fields: New

Description

TokenStreamToAutomaton (TS2A) in Lucene core is used by the AnalyzingSuggester (including its FuzzySuggester subclass), the NRT Document Suggester, and soon the SolrTextTagger. It has a setting, preservePositionIncrements, that defaults to true. If it is set to false (e.g. to ignore stopwords) and there is a trailing position increment greater than 1, TS2A still adds position increments (holes) to the automaton even though it was configured not to.
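To make the symptom concrete, here is a toy model of the behavior described above. It is a sketch only and does not use Lucene's classes or reproduce the real TokenStreamToAutomaton code: the class TrailingHoleSketch, the Token type, the "_" hole marker, the trailingHoles parameter (positions skipped after the last token) and the honorSettingAtEnd flag are all hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class TrailingHoleSketch {
    /** Hypothetical stand-in for a token: its text and its position increment. */
    static final class Token {
        final String term;
        final int posInc;
        Token(String term, int posInc) { this.term = term; this.posInc = posInc; }
    }

    /** Hypothetical marker for a hole (a skipped position). */
    static final String HOLE = "_";

    /**
     * Flattens tokens into a path of terms and holes.
     * honorSettingAtEnd=false models the reported behavior: the setting is
     * applied between tokens but not to the trailing holes.
     * honorSettingAtEnd=true models the behavior the issue asks for.
     */
    static List<String> toPath(List<Token> tokens, int trailingHoles,
                               boolean preservePositionIncrements,
                               boolean honorSettingAtEnd) {
        List<String> path = new ArrayList<>();
        for (Token t : tokens) {
            int posInc = t.posInc;
            // With the setting off, gaps between tokens are collapsed.
            if (!preservePositionIncrements && posInc > 1) {
                posInc = 1;
            }
            for (int i = 1; i < posInc; i++) {
                path.add(HOLE);
            }
            path.add(t.term);
        }
        int holesAtEnd = trailingHoles;
        // The step that is missing in the reported behavior.
        if (honorSettingAtEnd && !preservePositionIncrements) {
            holesAtEnd = 0;
        }
        for (int i = 0; i < holesAtEnd; i++) {
            path.add(HOLE);
        }
        return path;
    }

    public static void main(String[] args) {
        // Models "wizard of the oz a" with "of", "the" and "a" removed as stopwords.
        List<Token> tokens = List.of(new Token("wizard", 1), new Token("oz", 3));
        System.out.println(toPath(tokens, 1, true, false));   // [wizard, _, _, oz, _]
        System.out.println(toPath(tokens, 1, false, false));  // [wizard, oz, _]
        System.out.println(toPath(tokens, 1, false, true));   // [wizard, oz]
    }
}
```

The second call shows the inconsistency: the holes between tokens disappear, but the one after the last token survives. The third call shows the output one would expect when preservePositionIncrements=false.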

I'm filing this issue separately from ~~LUCENE-8332~~, where I first found it. The fix is very simple, but I'm concerned about back-compat ramifications. I'll attach a patch to show the problem.

Attachments

- LUCENE-8344.patch, 08/Jun/18 22:11, 16 kB, David Smiley
- LUCENE-8344.patch, 04/Jun/18 15:01, 7 kB, David Smiley
- LUCENE-8344.patch, 01/Jun/18 17:12, 5 kB, David Smiley

People

Assignee: David Smiley

Reporter: David Smiley

Votes: 0

Watchers: 6

Dates

Created: 01/Jun/18 16:39

Updated: 28/Aug/22 15:31

Resolved: 14/Jun/18 03:49