Kafka / KAFKA-620

UnknownHostException looking for a ZK node crashes the broker

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.7.1
    • Fix Version/s: None
    • Component/s: core
    • Labels:
      None
    • Environment:
      Linux, Amazon's AMI

      Description

      If you kill a ZooKeeper node completely, so that its hostname no longer resolves to anything, the broker will die with a java.net.UnknownHostException.

      You will then be unable to restart the broker until the unknown host(s) are removed from server.properties.
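
      As a rough illustration of that workaround, the sketch below drops unresolvable hosts from a ZooKeeper connect string before it is handed to the broker. It is not part of Kafka or zkclient: the class name ZkConnectFilter and both method names are hypothetical, and it assumes Java 8 or later and a connect string of the form host:port,host:port with an optional /chroot suffix.

      ```java
      import java.net.InetAddress;
      import java.net.UnknownHostException;
      import java.util.ArrayList;
      import java.util.List;
      import java.util.function.Predicate;

      // Hypothetical helper, not part of Kafka or zkclient.
      public class ZkConnectFilter {

          /** Returns true if the host name currently resolves to at least one address. */
          public static boolean resolves(String host) {
              try {
                  InetAddress.getAllByName(host);
                  return true;
              } catch (UnknownHostException e) {
                  return false;
              }
          }

          /**
           * Drops every "host:port" entry whose host fails the given check.
           * An optional chroot suffix ("/path") is preserved unchanged.
           */
          public static String filterResolvable(String connectString, Predicate<String> hostOk) {
              int slash = connectString.indexOf('/');
              String hosts = slash < 0 ? connectString : connectString.substring(0, slash);
              String chroot = slash < 0 ? "" : connectString.substring(slash);

              List<String> kept = new ArrayList<>();
              for (String entry : hosts.split(",")) {
                  String trimmed = entry.trim();
                  if (trimmed.isEmpty()) {
                      continue;
                  }
                  int colon = trimmed.lastIndexOf(':');
                  String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
                  if (hostOk.test(host)) {
                      kept.add(trimmed);
                  }
              }
              if (kept.isEmpty()) {
                  throw new IllegalStateException("no resolvable ZooKeeper hosts in: " + connectString);
              }
              return String.join(",", kept) + chroot;
          }

          public static void main(String[] args) {
              // Assumption: the connect string is passed as the first argument.
              String connect = args.length > 0 ? args[0] : "localhost:2181";
              System.out.println(filterResolvable(connect, ZkConnectFilter::resolves));
          }
      }
      ```

      The host check is passed in as a predicate so the filtering logic can be exercised without touching DNS. Filtering only hides the dead node from the client at startup; it does not address a host that stops resolving while the broker is running.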

      We ran into this issue while testing our resilience to widespread AWS outages. If you can point me to the right place, I could have a go at fixing it. Unfortunately, I suspect the issue might be in the non-standard ZooKeeper library that Kafka uses.

      Here's the stack trace:
      org.I0Itec.zkclient.exception.ZkException: Unable to connect to [list of zookeepers]
      at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:66)
      at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:872)
      at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
      at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
      at kafka.server.KafkaZooKeeper.startup(KafkaZooKeeper.scala:44)
      at kafka.log.LogManager.<init>(LogManager.scala:87)
      at kafka.server.KafkaServer.startup(KafkaServer.scala:58)
      at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
      at kafka.Kafka$.main(Kafka.scala:50)
      at kafka.Kafka.main(Kafka.scala)
      Caused by: java.net.UnknownHostException: zk-101
      at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
      at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:850)
      at java.net.InetAddress.getAddressFromNameService(InetAddress.java:1201)
      at java.net.InetAddress.getAllByName0(InetAddress.java:1154)
      at java.net.InetAddress.getAllByName(InetAddress.java:1084)
      at java.net.InetAddress.getAllByName(InetAddress.java:1020)
      at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:387)
      at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:332)
      at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:383)
      at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:64)
      ... 9 more
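
      The trace shows the DNS failure wrapped inside a ZkException, so code that only inspects the outer exception sees "Unable to connect" and not the host that failed. The sketch below walks the cause chain to find the underlying UnknownHostException. The class and method names are hypothetical, and a plain RuntimeException stands in for ZkException so the example needs no external dependencies.

      ```java
      import java.net.UnknownHostException;

      // Hypothetical diagnostic helper, not part of Kafka or zkclient.
      public class RootCause {

          /** Walks the cause chain and returns the first UnknownHostException, or null if none. */
          public static UnknownHostException findUnknownHost(Throwable t) {
              int depth = 0;
              // The depth bound guards against a cyclic cause chain.
              while (t != null && depth++ < 32) {
                  if (t instanceof UnknownHostException) {
                      return (UnknownHostException) t;
                  }
                  t = t.getCause();
              }
              return null;
          }

          public static void main(String[] args) {
              // Stand-in for the wrapped exception in the trace above.
              Throwable wrapped = new RuntimeException(
                      "Unable to connect to [list of zookeepers]",
                      new UnknownHostException("zk-101"));

              UnknownHostException cause = findUnknownHost(wrapped);
              System.out.println(cause == null
                      ? "no DNS failure in cause chain"
                      : "unresolvable host: " + cause.getMessage());
          }
      }
      ```

      With the stand-in exception this prints "unresolvable host: zk-101". An exception thrown by the JVM resolver may carry a longer message on some platforms, so treat the message as diagnostic text and not as a bare host name.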

        Issue Links

          Activity

          Matthew Rathbone created issue -
          Matthew Rathbone made changes -
          Field: Description. The new value keeps the original text and appends the stack trace shown in the Description above.
          Guozhang Wang made changes -
          Link: This issue duplicates KAFKA-1082
          Guozhang Wang made changes -
          Status: Open -> Resolved
          Resolution: Duplicate

            People

            • Assignee:
              Unassigned
              Reporter:
              Matthew Rathbone
            • Votes:
              0
              Watchers:
              6
