Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-9289

Speed up Levenshtein distance calculation when we don't need the exact distance

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Patch Available
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • modules/spellchecker
    • None
    • New

    Description

      Sometimes when we calculate the Levenshtein distance we don't need the exact distance, we only want to know if the strings are similar enough.

      https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/spelling/SolrSpellChecker.java#L113-L114

      sug.score = sd.getDistance(original, sug.string);        
      if (sug.score < min) continue; 

      If we use this threshold in the distance calculation, we can speed it up, we can stop the calculation when we already know that the the the distance will be lower than the threshold.

      Attachments

        1. SOLR-14360-01.patch
          10 kB
          Andras Salamon

        Activity

          People

            Unassigned Unassigned
            asalamon74 Andras Salamon
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated: