Commons Lang
  1. Commons Lang
  2. LANG-106

[lang] StringUtils#getLevenshteinDistance() performance is sub-optimal

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.2
    • Component/s: lang.*
    • Labels:
      None
    • Environment:

      Operating System: other
      Platform: Other

      Description

      The implementation of Commons Lang StringUtils#getLevenshteinDistance(String,
      String) is based on work from <http://www.merriampark.com/ld.htm>. While this
      implementation works, it is very memory hungry and can thus slow down heavy
      computations (GC has much more to collect in memory-constrained environment).
      Actual implementation needs x*y byte of memory.

      An improved implementation can be found at
      <http://www.merriampark.com/ldjava.htm>, which can lead to performance
      improvements of up to 3 times (my own internal benchmarks in low-memory
      situation). This new implementation needs x+y bytes of memory.

      Please change the getLevenshteinDistance() implementation to use the one at
      <http://www.merriampark.com/ldjava.htm>.

        Activity

        Hide
        Henri Yandell added a comment -

        Sending src/java/org/apache/commons/lang/StringUtils.java
        Transmitting file data .
        Committed revision 385745.

        Show
        Henri Yandell added a comment - Sending src/java/org/apache/commons/lang/StringUtils.java Transmitting file data . Committed revision 385745.
        Hide
        Henri Yandell added a comment -

        Code applied. Unit tests pass.

        I'll drop an email to both Chas Emerick and Michael Gilleland to let them know
        the code has finally gone in - my fault two and a half years ago.

        Show
        Henri Yandell added a comment - Code applied. Unit tests pass. I'll drop an email to both Chas Emerick and Michael Gilleland to let them know the code has finally gone in - my fault two and a half years ago.
        Hide
        Henri Yandell added a comment -

        Here's the history:
        http://mail-archives.apache.org/mod_mbox/jakarta-commons-dev/200310.mbox/%3CA34BA4886ADBBD4A804452178366659B160B14@solidus-fs1.solidusnetworks.com%3E

        Not sure why it didn't get applied - either I dropped the ball, or I was waiting
        for a second reply.

        Show
        Henri Yandell added a comment - Here's the history: http://mail-archives.apache.org/mod_mbox/jakarta-commons-dev/200310.mbox/%3CA34BA4886ADBBD4A804452178366659B160B14@solidus-fs1.solidusnetworks.com%3E Not sure why it didn't get applied - either I dropped the ball, or I was waiting for a second reply.
        Hide
        Cédrik LIME added a comment -

        Created an attachment (id=17859)
        Source code for new implementation

        In case I didn't get the pach file right...

        Show
        Cédrik LIME added a comment - Created an attachment (id=17859) Source code for new implementation In case I didn't get the pach file right...
        Hide
        Cédrik LIME added a comment -

        Created an attachment (id=17858)
        patch file

        Diff between original StringUtils.java and my patched one.

        Show
        Cédrik LIME added a comment - Created an attachment (id=17858) patch file Diff between original StringUtils.java and my patched one.
        Hide
        ggregory@seagullsw.com added a comment -

        Cédrik: Please feel free to provide a diff patch.

        Show
        ggregory@seagullsw.com added a comment - Cédrik: Please feel free to provide a diff patch.

          People

          • Assignee:
            Unassigned
            Reporter:
            Cédrik LIME
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development