Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1, 1.2
    • Component/s: None
    • Labels:
      None

      Description

      Building the messages to write to the console and the log file can be done outside the couch_log gen_event. This significantly increases the parallelism when many processes log messages. The following relaximation test graph shows a throughput increase using the attached patch against current trunk:

      http://graphs.mikeal.couchone.com/#/graph/0379dbdaef29b1c0fbf03421540243f7

      1. COUCHDB-1054-3.patch
        5 kB
        Filipe Manana
      2. COUCHDB-1054-2.patch
        5 kB
        Filipe Manana
      3. COUCHDB-1054.patch
        5 kB
        Filipe Manana

        Activity

        Hide
        Filipe Manana added a comment -

        Simpler version of the patch.
        Also gets rid of the useless all possible EOLs substitution with CRLF.

        http://graphs.mikeal.couchone.com/#/graph/0379dbdaef29b1c0fbf0342154029698

        Show
        Filipe Manana added a comment - Simpler version of the patch. Also gets rid of the useless all possible EOLs substitution with CRLF. http://graphs.mikeal.couchone.com/#/graph/0379dbdaef29b1c0fbf0342154029698
        Hide
        Randall Leeds added a comment -

        Reviewed the patch and it looks good. Nice performance win!
        Would if we added the timestamp in get_log_message?
        It might make some log messages appear out of order, but it would keep the timestamp closer to the real time of the event.
        Could the timestamps ever provide more information about inter-related gen_server crashes? Information which is not clear from the backtrace, the last message received, or the supervision tree?
        Also, fewer io:format maybe creates less garbage?
        I don't feel strongly either way, but I thought I'd pick it out and ask your opinion since the rest of the patch is unquestionably solid.

        Show
        Randall Leeds added a comment - Reviewed the patch and it looks good. Nice performance win! Would if we added the timestamp in get_log_message? It might make some log messages appear out of order, but it would keep the timestamp closer to the real time of the event. Could the timestamps ever provide more information about inter-related gen_server crashes? Information which is not clear from the backtrace, the last message received, or the supervision tree? Also, fewer io:format maybe creates less garbage? I don't feel strongly either way, but I thought I'd pick it out and ask your opinion since the rest of the patch is unquestionably solid.
        Hide
        Adam Kocoloski added a comment -

        Nice find. A few suggestions:

        1) Have you tried using io:put_chars/2 instead of io:format in the log() function? Now that the arguments are simple strings we don't need all the printf goop there.

        2) If the output of get_log_message() is longer than 64 characters (and it usually is) it may make sense to do a list_to_binary there and send a refc binary to the logger so we don't copy the data.

        Show
        Adam Kocoloski added a comment - Nice find. A few suggestions: 1) Have you tried using io:put_chars/2 instead of io:format in the log() function? Now that the arguments are simple strings we don't need all the printf goop there. 2) If the output of get_log_message() is longer than 64 characters (and it usually is) it may make sense to do a list_to_binary there and send a refc binary to the logger so we don't copy the data.
        Hide
        Filipe Manana added a comment -

        Yes, the addition of the timestamp is not externalized to avoid having strange logs, with later timestamps coming before earlier ones. If externalized, it would likely confuse many users. However I don't mind at all about it.

        Is everyone ok with this?

        At some point I had the message converted to a binary with an iolist_to_binary call, it's certainly difficult to measure the impact, but I'm not sure what is more expensive: converting to a binary or copying an iolist.
        I didn't know about io:put_chars, but it seems to accept binaries and iolists besides charlists (the docs only mention charlists). I'll definitely use it.

        Show
        Filipe Manana added a comment - Yes, the addition of the timestamp is not externalized to avoid having strange logs, with later timestamps coming before earlier ones. If externalized, it would likely confuse many users. However I don't mind at all about it. Is everyone ok with this? At some point I had the message converted to a binary with an iolist_to_binary call, it's certainly difficult to measure the impact, but I'm not sure what is more expensive: converting to a binary or copying an iolist. I didn't know about io:put_chars, but it seems to accept binaries and iolists besides charlists (the docs only mention charlists). I'll definitely use it.
        Hide
        Volker Mische added a comment -

        I'm definitely for timestamps. When you have a long running session with some event you are interested in in between, it's easier to find the corresponding message (even if they are out of order).

        Show
        Volker Mische added a comment - I'm definitely for timestamps. When you have a long running session with some event you are interested in in between, it's easier to find the corresponding message (even if they are out of order).
        Hide
        Adam Kocoloski added a comment -

        Hi Volker, Filipe was only addressing whether the timestamp should be inserted by the client process or by the logger itself. If done by the client process it's possible that timestamps may not monotonically increase. We'll certainly continue to insert timestamps in the log files.

        Show
        Adam Kocoloski added a comment - Hi Volker, Filipe was only addressing whether the timestamp should be inserted by the client process or by the logger itself. If done by the client process it's possible that timestamps may not monotonically increase. We'll certainly continue to insert timestamps in the log files.
        Hide
        Filipe Manana added a comment -

        Volker, I don't fully understand your comment. What do you mean with "I'm definitely for timestamps"? Timestamps were always present. Or do you mean that having apparently out of order timestamps is not an issue for you?

        Show
        Filipe Manana added a comment - Volker, I don't fully understand your comment. What do you mean with "I'm definitely for timestamps"? Timestamps were always present. Or do you mean that having apparently out of order timestamps is not an issue for you?
        Hide
        Filipe Manana added a comment -

        Updated patch.
        Adding the timestamp outside the event handler and using io:put_chars/[1,2].

        Relaximation graph:

        http://graphs.mikeal.couchone.com/#/graph/0379dbdaef29b1c0fbf034215402a52e

        While running this test (100 readers, 100 writers), all log entries have monotonically increasing timestamps. Their resolution is at the second level, so it shouldn't be very common to get 2 consecutive log messages without increasing timestamps.

        Any objection for 1.1.x inclusion (besides trunk)?

        Show
        Filipe Manana added a comment - Updated patch. Adding the timestamp outside the event handler and using io:put_chars/ [1,2] . Relaximation graph: http://graphs.mikeal.couchone.com/#/graph/0379dbdaef29b1c0fbf034215402a52e While running this test (100 readers, 100 writers), all log entries have monotonically increasing timestamps. Their resolution is at the second level, so it shouldn't be very common to get 2 consecutive log messages without increasing timestamps. Any objection for 1.1.x inclusion (besides trunk)?
        Hide
        Sebastian Cohnen added a comment -

        Although I don't exactly know what Filipe improved here, this is not about the memory consumption problems with logging on debug e.g., right?

        Show
        Sebastian Cohnen added a comment - Although I don't exactly know what Filipe improved here, this is not about the memory consumption problems with logging on debug e.g., right?
        Hide
        Adam Kocoloski added a comment -

        Hi Sebastian, I expect it will help with that somewhat.

        Show
        Adam Kocoloski added a comment - Hi Sebastian, I expect it will help with that somewhat.
        Hide
        Adam Kocoloski added a comment -

        @Filipe no objections from me.

        Show
        Adam Kocoloski added a comment - @Filipe no objections from me.
        Hide
        Jan Lehnardt added a comment -

        Filipe, looks good, no objections for 1.1.x

        Show
        Jan Lehnardt added a comment - Filipe, looks good, no objections for 1.1.x
        Hide
        Filipe Manana added a comment -

        Applied to trunk and 1.1.x

        Show
        Filipe Manana added a comment - Applied to trunk and 1.1.x
        Hide
        Randall Leeds added a comment -

        Just saw this got committed but I had one more thought: do we save any extra conversion and binary creation by using the binary of ConsoleMsg as the second element of FileMsg?

        Show
        Randall Leeds added a comment - Just saw this got committed but I had one more thought: do we save any extra conversion and binary creation by using the binary of ConsoleMsg as the second element of FileMsg?

          People

          • Assignee:
            Filipe Manana
            Reporter:
            Filipe Manana
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development