Qpid
  1. Qpid
  2. QPID-4517

[Java Broker] default log4j configuration outputs line numbers, which is a performance drag

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 0.20
    • Fix Version/s: 0.21
    • Component/s: Java Broker
    • Labels:
      None

      Description

      By default we produce this sort of logging:

      2012-12-20 09:52:51,490 INFO  [main] (AbstractConfiguration.java:124) - Tag:'path'
      

      Including line numbers is slow, as described by the log4j documentation (http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PatternLayout.html):

      "WARNING Generating caller location information is extremely slow and should be avoided unless execution speed is not an issue."

        Activity

        Hide
        Philip Harvey added a comment -

        changes committed - please can you hit the Accept button if you're happy

        Show
        Philip Harvey added a comment - changes committed - please can you hit the Accept button if you're happy
        Hide
        Robbie Gemmell added a comment -

        I'm fine with it being either way, I assigned it back to you so you could actually make the change (my comment could have perhaps been clearer on that ).

        Show
        Robbie Gemmell added a comment - I'm fine with it being either way, I assigned it back to you so you could actually make the change (my comment could have perhaps been clearer on that ).
        Hide
        Philip Harvey added a comment -

        Attached perf test config and results corresponding to my the comments I added a few minutes ago.

        Show
        Philip Harvey added a comment - Attached perf test config and results corresponding to my the comments I added a few minutes ago.
        Hide
        Philip Harvey added a comment -

        I tested this log format change using a persistent store, with one producer and one consumer and a transaction batch size of one, again with logger org.apache.qpid set to DEBUG. I found that it still causes a significant performance improvement - throughput doubled on my laptop.

        My laptop is a cheap dual core machine with an SSD drive, so it probably exaggerates the impact of this change, due to its fast IO and slow processor. Nevertheless, these results suggest to me that typical production deployments would see a significant performance increase when using DEBUG logging.

        When the log level is INFO or above, there is, as expected, no measurable performance improvement.

        The scenario I have in mind is when a production support team want to switch on DEBUG logging for a few minutes to help pin down the source of a performance problem, and obviously want to minimise the impact of this logging. It is frustrating when support teams are so fearful of the performance impact of debug-level logging that they resist enabling it, even if it would help solve a problem.

        Regarding line numbers... on the whole, our log messages are sufficiently descriptive to allow us to locate the line of code that produced them, without requiring file names and line numbers. In my short time working on Qpid, I've never had to resort to using line numbers (in fact, they can be positively misleading in cases where we've written our own centralised logging functions). Any logging that is untraceable without an explicit file name and line number is obviously inadequate and should be fixed.

        My view is that the performance increase of this change offsets the lack of line numbers in the log file. However, I accept that this is a trade-off so won't push this change if the majority disagree with me.

        Show
        Philip Harvey added a comment - I tested this log format change using a persistent store, with one producer and one consumer and a transaction batch size of one, again with logger org.apache.qpid set to DEBUG. I found that it still causes a significant performance improvement - throughput doubled on my laptop. My laptop is a cheap dual core machine with an SSD drive, so it probably exaggerates the impact of this change, due to its fast IO and slow processor. Nevertheless, these results suggest to me that typical production deployments would see a significant performance increase when using DEBUG logging. When the log level is INFO or above, there is, as expected, no measurable performance improvement. The scenario I have in mind is when a production support team want to switch on DEBUG logging for a few minutes to help pin down the source of a performance problem, and obviously want to minimise the impact of this logging. It is frustrating when support teams are so fearful of the performance impact of debug-level logging that they resist enabling it, even if it would help solve a problem. Regarding line numbers... on the whole, our log messages are sufficiently descriptive to allow us to locate the line of code that produced them, without requiring file names and line numbers. In my short time working on Qpid, I've never had to resort to using line numbers (in fact, they can be positively misleading in cases where we've written our own centralised logging functions). Any logging that is untraceable without an explicit file name and line number is obviously inadequate and should be fixed. My view is that the performance increase of this change offsets the lack of line numbers in the log file. However, I accept that this is a trade-off so won't push this change if the majority disagree with me.
        Hide
        Robbie Gemmell added a comment -

        Seems reasonable enough. I guess the question is whether its harder to analyse the messages without the file and line numbers and if so whether any performance increase offsets that.

        As you say, its unlikely to offer anywhere near that improvement in a real world use, but it would be interesting to know how much. I imagine persistence would near completely nullify it, and I guess the worst case scenario of connection/session-per-message type usage generating lots of logging is another interesting one, but on the flip side that will also have the effect of making network I/O dominate performance during connection/session creation.

        Show
        Robbie Gemmell added a comment - Seems reasonable enough. I guess the question is whether its harder to analyse the messages without the file and line numbers and if so whether any performance increase offsets that. As you say, its unlikely to offer anywhere near that improvement in a real world use, but it would be interesting to know how much. I imagine persistence would near completely nullify it, and I guess the worst case scenario of connection/session-per-message type usage generating lots of logging is another interesting one, but on the flip side that will also have the effect of making network I/O dominate performance during connection/session creation.
        Hide
        Philip Harvey added a comment -

        Robbie - what do you think about this change?

        Informal testing showed a x2 throughput improvement when using transient messages and with DEBUG logging enabled.

        Obviously in most production deployments the speed-up will be much smaller.

        Show
        Philip Harvey added a comment - Robbie - what do you think about this change? Informal testing showed a x2 throughput improvement when using transient messages and with DEBUG logging enabled. Obviously in most production deployments the speed-up will be much smaller.
        Hide
        Philip Harvey added a comment -

        attached patch

        Show
        Philip Harvey added a comment - attached patch

          People

          • Assignee:
            Robbie Gemmell
            Reporter:
            Philip Harvey
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development