Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-5778

Kafka cluster is not responding when one broker hangs and resulted in too many connections in close_wait in other brokers

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Critical
    • Resolution: Unresolved
    • 0.10.0.1
    • None
    • None
    • None

    Description

      In a cluster of 3 brokers , one of the broker(Broker-1 ) is hanged and from then other two brokers has connections in close_wait for java client producer/consumer and also even some broker to broker connections are in close wait among those two brokers.
      Kafka Version : 0.10.0.1
      In logs I found replica fetcher thread connection refused exceptions:
      In broker 0 : replica fetcher 0-1, replica fetcher 0-2
      In broker 2 : replica fetcher 0-0, replica fetcher 0-1
      In broker 1 : It was hung no logs were available at that time.

      We tried restarting broker- 2 kafka and then it was not successful as it terminated saying zookeeper timeout
      then we tried restarting broker- 0 kafka and we got the same error
      Broker -1 was hang so , we could not login even into it
      so we restarted broker -1 machine
      then we restarted all zookepers and then kafka brokers now everything is fine

      Attachments

        Activity

          People

            Unassigned Unassigned
            saichand saichand
            Votes:
            1 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated: