Uploaded image for project: 'Commons Lang'
  1. Commons Lang
  2. LANG-1199

Fix implementation of StringUtils.getJaroWinklerDistance()

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 3.4
    • 3.5
    • lang.*
    • None

    Description

      The current implementation of StringUtils.getJaroWinklerDistance() does not compute the correct result in some cases. See #LANG-944 for the initial code contribution.

      StringUtils.getJaroWinklerDistance("Haus Ingeborg", "Ingeborg Esser") == 0.0

      This is due to the incorrect computation of common characters, which causes the algorithm to exit prematurely.

      In contrast, the implementation in Lucene gives ~0.63, which is about right.

      JaroWinklerDistance d = new JaroWinklerDistance();
      getDistance("Haus Ingeborg", "Ingeborg Esser");

      See https://lucene.apache.org/core/3_0_3/api/contrib-spellchecker/org/apache/lucene/search/spell/JaroWinklerDistance.html

      Attachments

        Issue Links

          Activity

            People

              pascalschumacher Pascal Schumacher
              msteiger M. Steiger
              Votes:
              2 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: