Lucene - Core / LUCENE-89

Fuzzy searches are case sensitive

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.2
    • Fix Version/s: None
    • Component/s: core/search
    • Labels:
      None
    • Environment:

      Operating System: other
      Platform: All

    • Bugzilla Id:
      18014

      Description

      I've found that fuzzy search terms are case sensitive. For example, "Adagio" is calculated as having a Levenshtein distance of 1 from "adagio". Likewise, "ADAGIO" has a distance of 6 from "adagio", so it would not be returned as a result when searching for 'adagio~'.
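The distances quoted above can be reproduced with a small sketch. This is a textbook dynamic-programming Levenshtein distance written for illustration, not the `editDistance` method from `FuzzyTermEnum`; the class name `FuzzyCaseDemo` is hypothetical, and `Locale.ROOT` is an assumption made here to keep the lowercasing locale-independent (the patch in this issue calls `toLowerCase()` with the default locale).

```java
import java.util.Locale;

// Illustrative only: a plain Levenshtein distance, not Lucene's own code.
class FuzzyCaseDemo {

    // Classic two-row dynamic-programming edit distance
    // (insert, delete, substitute, each with cost 1).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String query = "adagio";
        for (String indexed : new String[] {"adagio", "Adagio", "ADAGIO"}) {
            int raw = editDistance(query, indexed);
            int folded = editDistance(query.toLowerCase(Locale.ROOT),
                                      indexed.toLowerCase(Locale.ROOT));
            System.out.println(indexed + ": raw=" + raw + ", case-folded=" + folded);
        }
    }
}
```

Comparing the raw and case-folded columns shows why a case difference alone can push an otherwise identical term out of the fuzzy match.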

      The patch is trivial; here it is:

          *** lucene-1.2\src\java\org\apache\lucene\search\FuzzyTermEnum.java Sun Jun 09 13:47:54 2002
          --- patched\src\java\org\apache\lucene\search\FuzzyTermEnum.java Fri Mar 14 11:37:20 2003
          ***************
          *** 77,83 ****
            super(reader, term);
            searchTerm = term;
            field = searchTerm.field();
          ! text = searchTerm.text();
            textlen = text.length();
            setEnum(reader.terms(new Term(searchTerm.field(), "")));
            }
          --- 77,83 ----
            super(reader, term);
            searchTerm = term;
            field = searchTerm.field();
          ! text = searchTerm.text().toLowerCase();
            textlen = text.length();
            setEnum(reader.terms(new Term(searchTerm.field(), "")));
            }
          ***************
          *** 88,94 ****
            */
            final protected boolean termCompare(Term term) {
            if (field == term.field()) {
          ! String target = term.text();
            int targetlen = target.length();
            int dist = editDistance(text, target, textlen, targetlen);
            distance = 1 - ((double)dist / (double)Math.min(textlen, targetlen));
          --- 88,94 ----
            */
            final protected boolean termCompare(Term term) {
            if (field == term.field()) {
          ! String target = term.text().toLowerCase();
            int targetlen = target.length();
            int dist = editDistance(text, target, textlen, targetlen);
            distance = 1 - ((double)dist / (double)Math.min(textlen, targetlen));
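To see how the edit distance feeds into the score, the last line of the patched hunk can be evaluated on its own. The sketch below copies only that one expression; the class name `FuzzySimilarityDemo` and the method name `similarity` are hypothetical, and the edit distances passed in are the ones stated in the description rather than values computed by Lucene.

```java
// Illustrative only: evaluates the normalisation expression from termCompare
// in isolation. The edit distance is taken as an input, not computed here.
class FuzzySimilarityDemo {

    // Same expression as in the patch: 1 - dist / min(textlen, targetlen).
    static double similarity(int dist, int textlen, int targetlen) {
        return 1 - ((double) dist / (double) Math.min(textlen, targetlen));
    }

    public static void main(String[] args) {
        int len = "adagio".length(); // 6 for both the query and the indexed term

        // Distances as given in the issue description.
        System.out.println("Adagio, case-sensitive: " + similarity(1, len, len));
        System.out.println("ADAGIO, case-sensitive: " + similarity(6, len, len));

        // After lowercasing both sides, the terms are identical (distance 0).
        System.out.println("ADAGIO, lowercased:     " + similarity(0, len, len));
    }
}
```

With both strings six characters long, a distance of 6 drives the score to zero, while lowercasing both sides first gives the maximum score of 1.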

            People

            • Assignee:
              java-dev@lucene.apache.org Lucene Developers
            • Reporter:
              cormac@siderean.com Cormac Twomey
