Uploaded image for project: 'Commons Text'
  1. Commons Text
  2. TEXT-188

Speed up LevenshteinDistance with threshold

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 1.9
    • 1.10.0
    • None

    Description

      The calculation made by the LevenshteinDistance class can often be made faster, when the class in initialized with a threshold, and when the distance is found to be larger than the threshold. In those cases, it is often not necessary to iterate through the whole string, since a lower bound for the result can be established after each iteration. If that lower bound is larger than the threshold, the method can simply exit early with the same result as without this improvement. 

      A patch with the proposed change is attached to this issue.

      Attachments

        1. improvement.patch
          1 kB
          Jakob Vesterstrøm

        Activity

          People

            kinow Bruno P. Kinoshita
            vesterstroem Jakob Vesterstrøm
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 1h 40m
                1h 40m