ZooKeeper / ZOOKEEPER-4183

Leader election not working when using hostname in server config and hostname resolves to an internal IP address


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: server
    • Labels:
      None

      Description

      We have a working setup with ZooKeeper 3.5.6 with 3 servers and server config similar to:

      server.1=1.foo.com:2182:2183:participant
      server.2=2.foo.com:2182:2183:participant
      server.3=3.foo.com:2182:2183:participant

      The ZooKeeper servers run in Docker containers. Each has an IP address in the 10.x.x.x range and also an address from the default Docker network (172.17.x.x) that is only usable inside the Docker container.
      /etc/hosts has e.g.:

      172.17.2.192 1.foo.com
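
The effect of that /etc/hosts entry on the local JVM can be sketched as follows. This is a minimal illustration, not ZooKeeper code: the class and method names are hypothetical, and `InetAddress.getByAddress` stands in for the real lookup so that the example runs without touching a resolver. The resulting socket address carries both the configured hostname and the container-local IP.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical demo class; not part of ZooKeeper.
final class HostsEntryDemo {

    // Models the mapping "172.17.2.192 1.foo.com" without a real lookup.
    static InetSocketAddress localElectionAddr() {
        byte[] ip = {(byte) 172, 17, 2, (byte) 192};
        try {
            InetAddress resolved = InetAddress.getByAddress("1.foo.com", ip);
            return new InetSocketAddress(resolved, 2183);
        } catch (UnknownHostException e) {
            // Only thrown when the byte array has an illegal length.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress addr = localElectionAddr();
        // The same object exposes the configured name and the resolved IP.
        System.out.println("hostname: " + addr.getHostString());
        System.out.println("address:  " + addr.getAddress().getHostAddress());
    }
}
```

Which of the two values ends up in the election messages determines what the other servers try to connect to.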

      When upgrading to 3.5.7, leader election failed with the following error in the ZooKeeper log:

      .org.apache.zookeeper.server.quorum.QuorumCnxManager Received connection request 10.2.2.192:37028
      .org.apache.zookeeper.server.quorum.UnifiedServerSocket Accepted TLS connection from /10.2.2.192:37028 - TLSv1.2 - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
      .org.apache.zookeeper.server.quorum.QuorumCnxManager
      Cannot open channel to 0 at election address /172.17.2.192:2183
      exception=
      java.net.NoRouteToHostException: No route to host (Host unreachable)
      at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
      at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
      at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
      at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
      at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
      at java.base/java.net.Socket.connect(Socket.java:609)
      at java.base/sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:285)

      We tried using version 3.6.2, but had the same issue there.

      After reading the code and looking at the changes between 3.5.6 and 3.5.7, I found
      https://issues.apache.org/jira/browse/ZOOKEEPER-3057 and the corresponding PR https://github.com/apache/zookeeper/pull/548.
      It seems that this PR changed the way election addresses are sent in leader election messages: from using
      hostnames (when hostnames are specified in the server config) to always using IP addresses.
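
Why this matters for the other quorum members can be sketched from the receiving side. The example below is an illustration under stated assumptions, not ZooKeeper's election code: `PeerViewDemo`, `PEER_RESOLVER` and `dialTarget` are hypothetical names, the resolver is a plain map, and 10.2.2.192 is only assumed to be the routable address of 1.foo.com as seen by the peers. An advertised IP is dialed as-is, whereas an advertised hostname goes through the peer's own name resolution.

```java
import java.util.Map;

// Hypothetical demo class; not part of ZooKeeper.
final class PeerViewDemo {

    // Assumed stand-in for name resolution as seen from another quorum member.
    static final Map<String, String> PEER_RESOLVER = Map.of("1.foo.com", "10.2.2.192");

    // Returns the "ip:port" a peer would dial for an advertised "host:port" string.
    // Handles hostnames and IPv4 literals only, which is enough for this sketch.
    static String dialTarget(String advertised) {
        int colon = advertised.lastIndexOf(':');
        String host = advertised.substring(0, colon);
        String port = advertised.substring(colon); // keeps the leading ':'
        // A literal IP is not in the map, so it is used unchanged.
        return PEER_RESOLVER.getOrDefault(host, host) + port;
    }

    public static void main(String[] args) {
        // IP-based advertisement: the container-local address leaks to the peer.
        System.out.println(dialTarget("172.17.2.192:2183"));
        // Hostname-based advertisement: the peer resolves the name itself.
        System.out.println(dialTarget("1.foo.com:2183"));
    }
}
```

Under these assumptions, only the hostname form leads the peer to an address it can actually route to, which would match the `NoRouteToHostException` for /172.17.2.192:2183 in the log above.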

      Patching 3.6.2 with the following change makes this work again for us:

      diff --git a/zookeeper-server/zookeeper-server-3.6.2/src/main/java/org/apache/zookeeper/common/NetUtils.java b/zookeeper-server/zookeeper-server-3.6.2/src/main/java/org/apache/zookeeper/common/NetUtils.java
      index be8cb9a638..f32f1da7c8 100644
      --- a/zookeeper-server/zookeeper-server-3.6.2/src/main/java/org/apache/zookeeper/common/NetUtils.java
      +++ b/zookeeper-server/zookeeper-server-3.6.2/src/main/java/org/apache/zookeeper/common/NetUtils.java
      @@ -27,13 +27,18 @@ import java.net.InetSocketAddress;
        */
       public class NetUtils {
      +    // Note: Changed from original to use hostname from InetSocketAddress if there exists one
           public static String formatInetAddr(InetSocketAddress addr) {
      +        String hostName = addr.getHostName();
      +        if (hostName != null) {
      +            return String.format("%s:%s", hostName, addr.getPort());
      +        }
      +
               InetAddress ia = addr.getAddress();

               if (ia == null) {
                   return String.format("%s:%s", addr.getHostString(), addr.getPort());
               }

               if (ia instanceof Inet6Address) {
                   return String.format("[%s]:%s", ia.getHostAddress(), addr.getPort());
               } else {
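
One caveat about the patch above: according to the Java API documentation, `InetSocketAddress.getHostName()` may trigger a reverse lookup when the address was created from a literal IP, while `getHostString()` does not. A possible variant, given here as a sketch rather than a proposed fix, prefers the hostname only when one was supplied and otherwise keeps the IP formatting. The class name is hypothetical, and the final IPv4 branch is an assumption because the quoted diff ends before it.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical variant of the patched method; not the ZooKeeper implementation.
final class HostnameFirstFormat {

    static String formatInetAddr(InetSocketAddress addr) {
        InetAddress ia = addr.getAddress();
        if (ia == null) {
            // Unresolved address: only the host string is available.
            return String.format("%s:%s", addr.getHostString(), addr.getPort());
        }
        // getHostString() returns the supplied hostname, or the literal IP
        // when no hostname is known, without attempting a reverse lookup.
        String host = addr.getHostString();
        if (!host.equals(ia.getHostAddress())) {
            return String.format("%s:%s", host, addr.getPort());
        }
        if (ia instanceof Inet6Address) {
            return String.format("[%s]:%s", ia.getHostAddress(), addr.getPort());
        }
        // Assumed IPv4 branch; the quoted diff is cut off before this point.
        return String.format("%s:%s", ia.getHostAddress(), addr.getPort());
    }

    public static void main(String[] args) {
        byte[] ip = {(byte) 172, 17, 2, (byte) 192};
        try {
            // Address that carries a hostname, as after an /etc/hosts lookup.
            InetAddress named = InetAddress.getByAddress("1.foo.com", ip);
            System.out.println(formatInetAddr(new InetSocketAddress(named, 2183)));
            // Address created from raw bytes only, with no hostname attached.
            InetAddress literal = InetAddress.getByAddress(ip);
            System.out.println(formatInetAddr(new InetSocketAddress(literal, 2183)));
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The design choice is to compare the host string with the literal address instead of testing for null, so that addresses configured by IP keep their existing format.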

       

            People

            • Assignee: Unassigned
            • Reporter: Harald Musum (hmusum)
