Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-3916

Connection from controller to broker disconnects

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0.1
    • Fix Version/s: 0.10.1.0
    • Component/s: controller
    • Labels:
      None

      Description

      We recently upgraded from 0.8.2.1 to 0.9.0.1. Since then, several times per day, the controllers in our clusters have their connection to all brokers disconnected, and then successfully reconnected a few hundred ms later. Each time this occurs we see a brief spike in our 99th percentile produce and consume times, reaching several hundred ms.

      Here is an example of what we're seeing in the controller.log:

      [2016-06-28 14:15:35,416] WARN [Controller-151-to-broker-160-send-thread], Controller 151 epoch 106 fails to send request {…} to broker Node(160, broker.160.hostname, 9092). Reconnecting to broker. (kafka.controller.RequestSendThread)
      java.io.IOException: Connection to 160 was disconnected before the response was read
              at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:87)
              at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:84)
              at scala.Option.foreach(Option.scala:236)
              at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:84)
              at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:80)
              at kafka.utils.NetworkClientBlockingOps$.recurse$1(NetworkClientBlockingOps.scala:129)
              at kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollUntilFound$extension(NetworkClientBlockingOps.scala:139)
              at kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:80)
              at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:180)
              at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:171)
              at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
      
      ... one each for all brokers (including the controller) ...
      
       [2016-06-28 14:15:35,721] INFO [Controller-151-to-broker-160-send-thread], Controller 151 connected to Node(160, broker.160.hostname, 9092) for sending state change requests (kafka.controller.RequestSendThread)
      
      … one each for all brokers (including the controller) ...
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                hachikuji Jason Gustafson
                Reporter:
                poweld Dave Powell
              • Votes:
                11 Vote for this issue
                Watchers:
                22 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: