Lucene - Core / LUCENE-10317

In the PhraseQuery API, the explanation of getSlop is not incorrect but could be clearer


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Not A Bug
    • Affects Version/s: 5.2.1
    • Fix Version/s: None
    • Component/s: core/search
    • Labels: None
    • Lucene Fields: New

    Description

      The explanation says that searching for "quick fox" will match the document "the fox is quick" with a slop of 3.

      That is true if the analyzer does not remove the stop word "is" at indexing time. With Lucene's standard stop word list, which includes "is", a slop of 2 is enough.

      As I understand the comment in the PhraseQuery source, switching the order of two words requires two moves (the first move places the words atop one another), so the slop is 2. If "is" is not removed, a third move is needed to account for "is" itself, and the slop is 3. I am not sure of this explanation and would be happy to have it confirmed, or not.

      I tested both cases in Lucene 5.2.1; the text is the same in the PhraseQuery API for 8_0_0.
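
      The two cases above can be checked with a small position-arithmetic sketch. It is a simplified model, not Lucene's scorer: it assumes each phrase term occurs once in the document, and that a removed stop word leaves no position gap (whether that holds depends on the analyzer configuration). The names SlopSketch and requiredSlop are hypothetical and are not part of the Lucene API.

```java
public class SlopSketch {

    /**
     * Smallest slop under which one alignment matches in this simplified model.
     * queryPos[i] is the position of term i in the phrase,
     * docPos[i] is the position of the same term in the document.
     */
    public static int requiredSlop(int[] queryPos, int[] docPos) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < queryPos.length; i++) {
            // Shift each document position back by the term's offset in the phrase.
            int shifted = docPos[i] - queryPos[i];
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        // An exact phrase match gives identical shifted positions, hence 0.
        return max - min;
    }

    public static void main(String[] args) {
        int[] phrase = {0, 1}; // "quick fox": quick=0, fox=1

        // "the fox is quick", every token keeps its position: quick=3, fox=1
        System.out.println(requiredSlop(phrase, new int[] {3, 1})); // prints 3

        // Assumption: "is" is dropped and leaves no gap: quick=2, fox=1
        System.out.println(requiredSlop(phrase, new int[] {2, 1})); // prints 2
    }
}
```

      For a two-term phrase this reduces to the absolute difference between the distance of the two terms in the document and their distance in the phrase, which gives 3 when "is" keeps its position and 2 when it does not.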

       

          People

            Assignee: Unassigned
            Reporter: Claude Lepère (claudelepere)
            Votes: 0
            Watchers: 2