[LUCENE-10317] In the PhraseQuery API, the explanation of getSlop is not incorrect but could be clearer - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Trivial
Resolution: Not A Bug
Affects Version/s: 5.2.1
Fix Version/s: None
Component/s: core/search
Labels: None

Lucene Fields: New

Description

The explanation of getSlop says that a search for "quick fox" matches the document "the fox is quick" with a slop of 3.

That is true if the analyzer does not remove the stop word "is" at index time. With Lucene's standard stop word list, which includes "is", a slop of 2 is enough.

As I understand the comment in the PhraseQuery source, switching the order of two words requires two moves (the first move places the words atop one another), so the slop is 2. If "is" is not removed, a third move is needed to account for "is" itself, and the slop is 3. I am not sure of this explanation and would be happy to have it confirmed or corrected.
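The move counting described above can be sketched in plain Java. This is a simplified model, not Lucene's implementation: it assumes each query term occurs exactly once in the document, and the class and method names (`SlopSketch`, `requiredSlop`) are hypothetical. For each query term it computes the shift between its position in the document and its position in the query; the slop needed is the spread between the largest and smallest shift.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of sloppy phrase matching, NOT Lucene's actual code.
// Assumption: every query term occurs exactly once in the document.
class SlopSketch {

    // Returns the smallest slop at which the phrase would match in this model.
    static int requiredSlop(String[] queryTerms, String[] docTokens) {
        if (queryTerms.length == 0) {
            return 0;
        }
        List<String> doc = Arrays.asList(docTokens);
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int queryPos = 0; queryPos < queryTerms.length; queryPos++) {
            int docPos = doc.indexOf(queryTerms[queryPos]);
            if (docPos < 0) {
                throw new IllegalArgumentException("term not in document: " + queryTerms[queryPos]);
            }
            // How far this term sits from where an exact phrase match would put it.
            int shift = docPos - queryPos;
            min = Math.min(min, shift);
            max = Math.max(max, shift);
        }
        return max - min;
    }

    public static void main(String[] args) {
        String[] query = {"quick", "fox"};
        // Stop words kept: "quick" shifts by 3, "fox" by 0.
        System.out.println(requiredSlop(query, new String[] {"the", "fox", "is", "quick"})); // prints 3
        // Stop words removed with no position gaps left behind (an assumption):
        // "quick" shifts by 1, "fox" by -1.
        System.out.println(requiredSlop(query, new String[] {"fox", "quick"})); // prints 2
    }
}
```

The second call only models the case where removed stop words leave no gaps in the token positions; whether that holds in a real index depends on how the analyzer is configured.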

I tested both cases in Lucene 5.2.1; the text is the same in the PhraseQuery API documentation for 8_0_0.

People

Assignee: Unassigned

Reporter: Claude Lepère

Votes: 0

Watchers: 2

Dates

Created: 15/Dec/21 18:17

Updated: 28/Aug/22 16:32

Resolved: 16/Dec/21 09:22