ZooKeeper
  1. ZooKeeper
  2. ZOOKEEPER-912

ZooKeeper client logs trace and debug messages at level INFO

    Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.3.1
    • Fix Version/s: 3.5.0
    • Component/s: java client
    • Labels:
      None

      Description

      ZK logs a lot of uninformative trace and debug messages to level INFO. This fuzzes up everything and makes it easy to miss useful log info.

        Activity

        Hide
        Anthony Urso added a comment -

        This patch attempts to lower the log level of debug and trace messages to where they belong.

        Show
        Anthony Urso added a comment - This patch attempts to lower the log level of debug and trace messages to where they belong.
        Hide
        Patrick Hunt added a comment -

        IMO more of an improvement than a bug.

        See this section on logging for our general guidelines (granted it's a very gray area)
        http://hadoop.apache.org/zookeeper/docs/current/zookeeperInternals.html#sc_logging

        I see some issues with this patch, while the original logging does make it a bit more "fuzzy" some of this is pretty critical information. I'd like to give detailed feedback/discussion though. Apache just opened a reviewboard instance, could you post the patch there for review?

        https://reviews.apache.org/dashboard/

        Also please post patches in ZOOKEEPER-###.patch form (also has detail on how to create the patch file itself):
        http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute

        Thanks!

        Show
        Patrick Hunt added a comment - IMO more of an improvement than a bug. See this section on logging for our general guidelines (granted it's a very gray area) http://hadoop.apache.org/zookeeper/docs/current/zookeeperInternals.html#sc_logging I see some issues with this patch, while the original logging does make it a bit more "fuzzy" some of this is pretty critical information. I'd like to give detailed feedback/discussion though. Apache just opened a reviewboard instance, could you post the patch there for review? https://reviews.apache.org/dashboard/ Also please post patches in ZOOKEEPER-###.patch form (also has detail on how to create the patch file itself): http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute Thanks!
        Hide
        Patrick Hunt added a comment -

        I have issues with this patch that I detailed on https://reviews.apache.org/r/7/

        In particular I think you can get the same effect through log4j.properties configuration. (details in my review feedback).

        I'm -1 for making these changes. What do other people think, agree/disagree?

        Show
        Patrick Hunt added a comment - I have issues with this patch that I detailed on https://reviews.apache.org/r/7/ In particular I think you can get the same effect through log4j.properties configuration. (details in my review feedback). I'm -1 for making these changes. What do other people think, agree/disagree?
        Hide
        Anthony Urso added a comment -

        I don't think you understand the problem.

        log4j has different log levels for different severities. Debug messages should be logged at level "debug." Warnings should be logged at level "warn." Errors should be logged at level "error." Zookeeper logs nearly everything at least at level "info," regardless of severity.

        This leads to a situation where the uninformative debug message:

        10/10/27 21:05:09 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session

        Is logged at the same level as the should-be-way-more-noticeable error message:

        10/10/27 21:05:09 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect

        I can use log4j properties to turn off "info" level messages, but then I won't see those warnings and errors.

        If you truly feel that these log messages should be all-or-nothing, why not get rid of log4j entirely and use System.out.println?

        Show
        Anthony Urso added a comment - I don't think you understand the problem. log4j has different log levels for different severities. Debug messages should be logged at level "debug." Warnings should be logged at level "warn." Errors should be logged at level "error." Zookeeper logs nearly everything at least at level "info," regardless of severity. This leads to a situation where the uninformative debug message: 10/10/27 21:05:09 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session Is logged at the same level as the should-be-way-more-noticeable error message: 10/10/27 21:05:09 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect I can use log4j properties to turn off "info" level messages, but then I won't see those warnings and errors. If you truly feel that these log messages should be all-or-nothing, why not get rid of log4j entirely and use System.out.println?
        Hide
        Patrick Hunt added a comment -

        Zookeeper logs nearly everything at least at level "info," regardless of severity.

        that's incorrect. I did a quick grep for logging in the main src and see the following:

        egrep -R "LOG\.error" src/java/main/. |wc -l
        78
        egrep -R "LOG\.warn" src/java/main/. |wc -l
        175
        egrep -R "LOG\.info" src/java/main/. |wc -l
        127
        egrep -R "LOG\.debug" src/java/main/. |wc -l
        114
        egrep -R "LOG\.trace" src/java/main/. |wc -l
        28

        So actually we log mostly at WARN severity. Perhaps you think this because you mainly see INFO messages, but that's to be expected (typically things work, we only log WARN/ERROR when bad things happen).

        I didn't say anything about all/nothing. Check the code, we have a number of messages at various levels, incl trace/debug. If you don't want to see the informational messages for a particular class you can configure that.

        As I pointed out earlier:
        http://hadoop.apache.org/zookeeper/docs/current/zookeeperInternals.html#sc_logging

        We consider both the messages you listed to be informational given that we expect/recover from the second.

        Show
        Patrick Hunt added a comment - Zookeeper logs nearly everything at least at level "info," regardless of severity. that's incorrect. I did a quick grep for logging in the main src and see the following: egrep -R "LOG\.error" src/java/main/. |wc -l 78 egrep -R "LOG\.warn" src/java/main/. |wc -l 175 egrep -R "LOG\.info" src/java/main/. |wc -l 127 egrep -R "LOG\.debug" src/java/main/. |wc -l 114 egrep -R "LOG\.trace" src/java/main/. |wc -l 28 So actually we log mostly at WARN severity. Perhaps you think this because you mainly see INFO messages, but that's to be expected (typically things work, we only log WARN/ERROR when bad things happen). I didn't say anything about all/nothing. Check the code, we have a number of messages at various levels, incl trace/debug. If you don't want to see the informational messages for a particular class you can configure that. As I pointed out earlier: http://hadoop.apache.org/zookeeper/docs/current/zookeeperInternals.html#sc_logging We consider both the messages you listed to be informational given that we expect/recover from the second.
        Hide
        Patrick Hunt added a comment -

        Hi Anthony, I realized in the shower this morning, by "Zookeeper" did you mean "ZooKeeper.java"? My bad.

        I looked at this class again and it does have logging at other levels than just info. Really it should have trace level logs for each of the api entry points. I'm concerned about pushing down the info level logs you highlighted though due to a couple factors; 1) in our experience those msgs are very useful to understand the runtime state of the client, 2) many users don't run in production at trace (and some don't want to run in debug).

        What's your rule of thumb for what should be logged at the various levels?

        Show
        Patrick Hunt added a comment - Hi Anthony, I realized in the shower this morning, by "Zookeeper" did you mean "ZooKeeper.java"? My bad. I looked at this class again and it does have logging at other levels than just info. Really it should have trace level logs for each of the api entry points. I'm concerned about pushing down the info level logs you highlighted though due to a couple factors; 1) in our experience those msgs are very useful to understand the runtime state of the client, 2) many users don't run in production at trace (and some don't want to run in debug). What's your rule of thumb for what should be logged at the various levels?
        Hide
        John L. Singleton added a comment -

        Hi Patrick and Anthony,

        If I may comment on this issue, I have recently had problems with Zookeeper and the verbosity of it's default logging. My problem is that a) the default logging is really too verbose in general and that b) including Zookeeper in your classpath causes Zookeeper's log4j.xml file to override established logging. To get around this I've actually had to repackage the zk jar without log4j.xml to tone down the logging.

        Now, as for the issue above, I agree that most of those messages, if they are used for "debugging" they really need to be at the DEBUG level, not INFO.

        Love ZK! Keep up the good work!

        Best,
        JLS

        Show
        John L. Singleton added a comment - Hi Patrick and Anthony, If I may comment on this issue, I have recently had problems with Zookeeper and the verbosity of it's default logging. My problem is that a) the default logging is really too verbose in general and that b) including Zookeeper in your classpath causes Zookeeper's log4j.xml file to override established logging. To get around this I've actually had to repackage the zk jar without log4j.xml to tone down the logging. Now, as for the issue above, I agree that most of those messages, if they are used for "debugging" they really need to be at the DEBUG level, not INFO. Love ZK! Keep up the good work! Best, JLS
        Hide
        John L. Singleton added a comment -

        If I could weigh in on this one too:

        "What's your rule of thumb for what should be logged at the various levels?"

        IMO:

        WARN: Recoverable errors
        ERROR: Non-recoverable errors
        INFO: Non error conditions pertaining operation, eg, for a SERVER side program, tell me that someone connected, but don't give me all the details about session establishment, etc. For a CLIENT side program, all these details should be omitted.
        TRACE: Information used for extended debugging, suitable for developers/contributors
        DEBUG: Information used for basic debugging, suitable for end users

        Show
        John L. Singleton added a comment - If I could weigh in on this one too: "What's your rule of thumb for what should be logged at the various levels?" IMO: WARN: Recoverable errors ERROR: Non-recoverable errors INFO: Non error conditions pertaining operation, eg, for a SERVER side program, tell me that someone connected, but don't give me all the details about session establishment, etc. For a CLIENT side program, all these details should be omitted. TRACE: Information used for extended debugging, suitable for developers/contributors DEBUG: Information used for basic debugging, suitable for end users
        Hide
        Patrick Hunt added a comment -

        Hi John, thanks for the feedback

        I've actually had to repackage the zk jar without log4j.xml to tone down the logging.

        well we don't have a log4j.xml at all, we do have a log4j.properties but that's not included in our primary or "maven bin" jar files. I just looked again and I don't see it included (at least in 3.3.1) Perhaps this is being pulled in from elsewhere in your environment?

        We do provide a default log4j.properties in the conf directory of the release artifact. However the intent is for users to either use that directly or customize it based on their needs, which is why it's a config file and not hardcoded.

        If I could weigh in on this one too... <logging levels>

        That's basically what we do. I grant that we skew to pushing more detail at INFO that might be DEBUG, and some DEBUG that should be TRACE. We have been working on that of late (see JIRA, a number of logging level changes went into 3.3). We could push a few more of the info messages to debug, but I'm reticent for a few reasons. Primarily the fact that we often get reports from users who see issues where the only detail we have to go on (esp given this is a complex distributed system) is the info (and higher) level logs. My experience supporting a number of production teams (15+) tells me this would compromise our ability to help them resolve issues.

        Show
        Patrick Hunt added a comment - Hi John, thanks for the feedback I've actually had to repackage the zk jar without log4j.xml to tone down the logging. well we don't have a log4j.xml at all, we do have a log4j.properties but that's not included in our primary or "maven bin" jar files. I just looked again and I don't see it included (at least in 3.3.1) Perhaps this is being pulled in from elsewhere in your environment? We do provide a default log4j.properties in the conf directory of the release artifact. However the intent is for users to either use that directly or customize it based on their needs, which is why it's a config file and not hardcoded. If I could weigh in on this one too... <logging levels> That's basically what we do. I grant that we skew to pushing more detail at INFO that might be DEBUG, and some DEBUG that should be TRACE. We have been working on that of late (see JIRA, a number of logging level changes went into 3.3). We could push a few more of the info messages to debug, but I'm reticent for a few reasons. Primarily the fact that we often get reports from users who see issues where the only detail we have to go on (esp given this is a complex distributed system) is the info (and higher) level logs. My experience supporting a number of production teams (15+) tells me this would compromise our ability to help them resolve issues.
        Hide
        Patrick Hunt added a comment -

        John could you give some examples of the log messages (ones actually output to the log) that you thought were excessive? (did you mean client or server? both?) It might help to frame the conversation. We might be able to address some of the more egregious ones.

        Here's a client session (default log4j.properties) where I created a client, ran a few commands, then quit (cli shell):

        2010-11-04 17:16:21,319 - INFO [main:ZooKeeper@373] - Initiating client connection, connectString= sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@2c6f7ce9
        2010-11-04 17:16:21,347 - INFO [main-SendThread():ClientCnxn$SendThread@1000] - Opening socket connection to server localhost/127.0.0.1:2181
        2010-11-04 17:16:21,392 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@908] - Socket connection established to localhost/127.0.0.1:2181, initiating session
        2010-11-04 17:16:21,486 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@701] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x12c1963f8210000, negotiated timeout = 30000

        then nothing until I quit

        2010-11-04 17:16:49,401 - INFO [main:ZooKeeper@538] - Session: 0x12c1963f8210000 closed

        So the entirety of the logging today is just 4 messages for the client establishment, one for closing the client. During the time that the client has an active session established there are no messages output. I don't see that as excessive personally, but others might not think the same. In my experience these are some very important messages when helping postmortem production failures.

        I could see where we might drop msgs 1-3 to debug level (keep 4, the "established" msg), as long as we highlight connection attempts that fail. Having detail about the sessionid and negotiated timeout is pretty critical though, from an informational perspective.

        Show
        Patrick Hunt added a comment - John could you give some examples of the log messages (ones actually output to the log) that you thought were excessive? (did you mean client or server? both?) It might help to frame the conversation. We might be able to address some of the more egregious ones. Here's a client session (default log4j.properties) where I created a client, ran a few commands, then quit (cli shell): 2010-11-04 17:16:21,319 - INFO [main:ZooKeeper@373] - Initiating client connection, connectString= sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@2c6f7ce9 2010-11-04 17:16:21,347 - INFO [main-SendThread():ClientCnxn$SendThread@1000] - Opening socket connection to server localhost/127.0.0.1:2181 2010-11-04 17:16:21,392 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@908] - Socket connection established to localhost/127.0.0.1:2181, initiating session 2010-11-04 17:16:21,486 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@701] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x12c1963f8210000, negotiated timeout = 30000 then nothing until I quit 2010-11-04 17:16:49,401 - INFO [main:ZooKeeper@538] - Session: 0x12c1963f8210000 closed So the entirety of the logging today is just 4 messages for the client establishment, one for closing the client. During the time that the client has an active session established there are no messages output. I don't see that as excessive personally, but others might not think the same. In my experience these are some very important messages when helping postmortem production failures. I could see where we might drop msgs 1-3 to debug level (keep 4, the "established" msg), as long as we highlight connection attempts that fail. Having detail about the sessionid and negotiated timeout is pretty critical though, from an informational perspective.

          People

          • Assignee:
            Anthony Urso
            Reporter:
            Anthony Urso
          • Votes:
            1 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:

              Development