Lucene - Core
  LUCENE-5206

FuzzyQuery: matching terms must be longer than maxEdits

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.5
    • Fix Version/s: 4.5, 6.0
    • Component/s: core/other
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      For FuzzyQuery to produce a match, both the query term and the indexed term must be longer than the maxEdits value. Based on a response from the java-user list, it looks like I wasn't the only one surprised by this. Let's document this design choice more clearly in the javadocs or modify the behavior.

      Apologies if I missed the documentation of this.

          Activity

          Tim Allison added a comment -

          Test cases.

          Michael McCandless added a comment -

          Thanks Tim, I agree we should update the javadocs here ... I'll do that, and add this test.

          These terms actually match the automaton, but then for each match we compute the "scaled distance", in FuzzyTermsEnum.java:

              final float similarity = 1.0f - ((float) ed / (float) (Math.min(codePointCount, termLength)));
          

          The resulting similarity must be > minSimilarity (which is >= 0), so the edit distance has to be smaller than the length of the shorter term. Indeed, as you said, both terms must be longer than maxEdits for a match at the full edit distance.
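
The scaled-distance check described above can be reproduced outside Lucene with a small sketch. This is not Lucene's code: the class name `ScaledDistanceSketch` and the helpers `editDistance`, `similarity`, and `accepted` are hypothetical, and the distance used here is plain Levenshtein without transpositions, which is an assumption made to keep the example short. Only the similarity formula and the `> minSimilarity` comparison are taken from the comment.

```java
class ScaledDistanceSketch {

    // Plain Levenshtein distance over code points (insert, delete, substitute).
    // Assumption: no transposition support, unlike a Damerau-Levenshtein variant.
    static int editDistance(String a, String b) {
        int[] s = a.codePoints().toArray();
        int[] t = b.codePoints().toArray();
        int[] prev = new int[t.length + 1];
        int[] curr = new int[t.length + 1];
        for (int j = 0; j <= t.length; j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= s.length; i++) {
            curr[0] = i;
            for (int j = 1; j <= t.length; j++) {
                int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.length];
    }

    // Same formula as the line quoted from FuzzyTermsEnum.java.
    static float similarity(int ed, int codePointCount, int termLength) {
        return 1.0f - ((float) ed / (float) (Math.min(codePointCount, termLength)));
    }

    // A candidate is kept only if its similarity is strictly greater than minSimilarity.
    static boolean accepted(String query, String term, float minSimilarity) {
        int ed = editDistance(query, term);
        int queryLength = query.codePointCount(0, query.length());
        int termLength = term.codePointCount(0, term.length());
        return similarity(ed, queryLength, termLength) > minSimilarity;
    }

    public static void main(String[] args) {
        // 1 edit, shorter length 3: similarity is about 0.67, so the term is kept.
        System.out.println(accepted("abc", "abd", 0f));  // true
        // 2 edits, shorter length 2: similarity is exactly 0, which is not > 0.
        System.out.println(accepted("ab", "cd", 0f));    // false
        // 2 edits, shorter length 2 ("ab"): rejected even though 2 edits suffice.
        System.out.println(accepted("abcd", "ab", 0f));  // false
    }
}
```

With minSimilarity at 0, a candidate survives only when the edit distance is strictly smaller than the length of the shorter of the two terms, which is why two-character terms never match at two edits.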

          ASF subversion and git services added a comment -

          Commit 1522733 from Michael McCandless in branch 'dev/trunk'
          [ https://svn.apache.org/r1522733 ]

          LUCENE-5206: fix javadocs to clarify FuzzyQuery's unexpected behaviour on short terms

          ASF subversion and git services added a comment -

          Commit 1522734 from Michael McCandless in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1522734 ]

          LUCENE-5206: fix javadocs to clarify FuzzyQuery's unexpected behaviour on short terms

          ASF subversion and git services added a comment -

          Commit 1522735 from Michael McCandless in branch 'dev/branches/lucene_solr_4_5'
          [ https://svn.apache.org/r1522735 ]

          LUCENE-5206: fix javadocs to clarify FuzzyQuery's unexpected behaviour on short terms

          ASF subversion and git services added a comment -

          Commit 1522736 from Michael McCandless in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1522736 ]

          LUCENE-5206: move CHANGES entry to 4.5

          ASF subversion and git services added a comment -

          Commit 1522737 from Michael McCandless in branch 'dev/trunk'
          [ https://svn.apache.org/r1522737 ]

          LUCENE-5206: move CHANGES entry to 4.5

          Michael McCandless added a comment -

          Thanks Tim!

          Adrien Grand added a comment -

          4.5 release -> bulk close


            People

            • Assignee:
              Michael McCandless
              Reporter:
              Tim Allison
            • Votes:
              0
              Watchers:
              5
