Uploaded image for project: 'ZooKeeper'
  1. ZooKeeper
  2. ZOOKEEPER-2893

very poor choice of logging if client fails to connect to server

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.6, 3.5.3, 3.6.0
    • Fix Version/s: 3.5.4, 3.6.0, 3.4.12
    • Component/s: java client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      We are using ZooKeeper in our project and have received reports that, when suffering a networking problem, log files become flooded with messages like:

      07 Sep 2017 08:22:00 (System) [] Session 0x45d3151be3600a9 for server null, unexpected error, closing socket connection and attempting reconnect
      java.net.NoRouteToHostException: No route to host
      at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_131]
      at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_131]
      at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
      at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) ~[zookeeper-3.4.6.jar:3.4.6-1569965]

      Looking at the code that logs this message (ClientCnxn), there seems to be quite a few problems here:

      1. the code logs a stack-trace, even though there is no bug here. In our project, we treat all logged stack-traces as bugs,
      2. if the networking issue is not fixed promptly, the log files is flooded with these message,
      3. The message is built using ClientCnxnSocket#getRemoteSocketAddress, yet in this case, this does not provide the expected information (yielding null),
      4. The log message fails to include a description of what actually went wrong.

      (Additionally, the code uses string concatenation rather than templating when building the message; however, this is an optimisation issue)

      My suggestion is that this log entry is updated so that it doesn't log a stack-trace, but does include some indication why the connection failed.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                andorm Andor Molnar
                Reporter:
                paulmillar Paul Millar
              • Votes:
                1 Vote for this issue
                Watchers:
                6 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: