Commons Lang
  1. Commons Lang
  2. LANG-162

[lang] replace() length calculation improvement

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 2.1
    • Fix Version/s: 2.2
    • Component/s: None
    • Labels:
      None
    • Environment:

      Operating System: other
      Platform: Other

      Description

      static String replace(String text, String repl, String with, int max)

      PROBLEM:
      This method incurs the calculation of the search string ("repl") every time a
      match is found.

      int start = 0, end = 0;
      while ((end = text.indexOf(repl, start)) != -1)

      { buf.append(text.substring(start, end)).append(with); start = end + repl.length(); ... }

      PROPOSED SOLUTION:
      Move it out of the loop.

      int start = 0, end = 0, replLen = repl.length();
      while ((end = text.indexOf(repl, start)) != -1)

      { buf.append(text.substring(start, end)).append(with); start = end + replLen; ... }

        Activity

        Hide
        ggregory@seagullsw.com added a comment -

        Since a String is immutable, I would hope the compiler if not the VM would
        already do this. IMO, unless this is a proven problem with data from a
        profiler, I do not do this type of optimization.

        Show
        ggregory@seagullsw.com added a comment - Since a String is immutable, I would hope the compiler if not the VM would already do this. IMO, unless this is a proven problem with data from a profiler, I do not do this type of optimization.
        Hide
        Henri Yandell added a comment -

        I'll take a look at it if Paul doesn't mention his evidence for optimisation - mostly just because profiling is
        fun and it'll be interesting to know if it is a speed issue.

        Show
        Henri Yandell added a comment - I'll take a look at it if Paul doesn't mention his evidence for optimisation - mostly just because profiling is fun and it'll be interesting to know if it is a speed issue.
        Hide
        Paul Benedict added a comment -

        I won't hold anyone in suspense. I simply eye-balled this as an optimization
        because I was designing functionality where string replacement has heavy usage;
        so I took a look at the code. It's not critical and I am generally satisified
        with the code as-is, but putting in my 2cents in case it helps.

        Show
        Paul Benedict added a comment - I won't hold anyone in suspense. I simply eye-balled this as an optimization because I was designing functionality where string replacement has heavy usage; so I took a look at the code. It's not critical and I am generally satisified with the code as-is, but putting in my 2cents in case it helps.
        Hide
        Henri Yandell added a comment -

        Created an attachment (id=18153)
        Simple perf test file

        Show
        Henri Yandell added a comment - Created an attachment (id=18153) Simple perf test file
        Hide
        Henri Yandell added a comment -

        Not sure which source you're looking at, the latest HEAD is different, but does still contain the
        nested .length() call. Looking at svn log etc, looks like we have an optimisation in already for the next
        release (see COM-2395).

        I created a performance test (attached), and running it against both pre and post change jars, there's no
        obvious difference in the numbers I'm getting back.

        So marking this as a WONTFIX.

        Show
        Henri Yandell added a comment - Not sure which source you're looking at, the latest HEAD is different, but does still contain the nested .length() call. Looking at svn log etc, looks like we have an optimisation in already for the next release (see COM-2395 ). I created a performance test (attached), and running it against both pre and post change jars, there's no obvious difference in the numbers I'm getting back. So marking this as a WONTFIX.
        Hide
        James Ring added a comment -

        I think it's silly to mark this as a WONTFIX, and the attached test case doesn't
        address the problem at hand.

        The test shows the call to length() being called only once per iteration of the
        main loop. If there are many matches, then length() will be called more times,
        and will add more overhead to the function. Of course, there's calls to
        append(), substring() and indexOf(), which probably dominate the running time of
        the function.

        All the same, why repeat a call multiple times when the result is always going
        to be the same? It seems the Sun Java compiler won't save you, because I haven't
        managed to get it to "optimize out" a repeated call to length() on an immutable
        string.

        The proposed solution is small and sensible and there's no good reason not to
        apply it.

        Show
        James Ring added a comment - I think it's silly to mark this as a WONTFIX, and the attached test case doesn't address the problem at hand. The test shows the call to length() being called only once per iteration of the main loop. If there are many matches, then length() will be called more times, and will add more overhead to the function. Of course, there's calls to append(), substring() and indexOf(), which probably dominate the running time of the function. All the same, why repeat a call multiple times when the result is always going to be the same? It seems the Sun Java compiler won't save you, because I haven't managed to get it to "optimize out" a repeated call to length() on an immutable string. The proposed solution is small and sensible and there's no good reason not to apply it.
        Hide
        Henri Yandell added a comment -

        Created an attachment (id=18216)
        Second attempt at PerfTest

        Attaching a perf test that increases the size of the central text so that the
        number of matches increases.

        Show
        Henri Yandell added a comment - Created an attachment (id=18216) Second attempt at PerfTest Attaching a perf test that increases the size of the central text so that the number of matches increases.
        Hide
        Henri Yandell added a comment -

        Second perf test didn't show any real increase, but given that it's a standard
        optimisation you're right James - might as well apply it. Have done the same to
        replaceChars too.

        "Applying the optimisation advised by Paul Benedict in #39368"
        src/java/org/apache/commons/lang/StringUtils.java
        Sending src/java/org/apache/commons/lang/StringUtils.java
        Transmitting file data .
        Committed revision 399127.

        Show
        Henri Yandell added a comment - Second perf test didn't show any real increase, but given that it's a standard optimisation you're right James - might as well apply it. Have done the same to replaceChars too. "Applying the optimisation advised by Paul Benedict in #39368" src/java/org/apache/commons/lang/StringUtils.java Sending src/java/org/apache/commons/lang/StringUtils.java Transmitting file data . Committed revision 399127.

          People

          • Assignee:
            Unassigned
            Reporter:
            Paul Benedict
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development