Kafka / KAFKA-7931

Java Client: if all ephemeral brokers fail, client can never reconnect to brokers

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: clients
    • Labels: None

    Description

      Steps to reproduce:

      • Set up a Kafka cluster in GKE, with the bootstrap server address pointing to a load balancer that exposes all GKE nodes
      • Run a producer that emits values into a partition with 3 replicas
      • Kill every broker in the cluster
      • Wait for the brokers to restart
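
      The producer side of these steps could be configured roughly as in the sketch below. This is not the reporter's code: the class name, the load balancer host name, and the choice of acks=all and String serializers are assumptions made for illustration. Only the property keys are standard Kafka producer settings.

```java
import java.util.Properties;

/** Hypothetical helper, not taken from the report: producer settings for the repro. */
class ReproProducerConfig {

    // Builds producer properties whose only bootstrap address is the load balancer.
    static Properties producerProps(String loadBalancerAddress) {
        Properties props = new Properties();
        // The report points bootstrap.servers at a load balancer in front of all GKE nodes.
        props.put("bootstrap.servers", loadBalancerAddress);
        // Assumption: wait for all in-sync replicas; the report does not state the acks setting.
        props.put("acks", "all");
        // Assumption: plain String keys and values.
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // "kafka-lb.example.internal:9092" is a placeholder address.
        Properties props = producerProps("kafka-lb.example.internal:9092");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
        // With kafka-clients on the classpath, these properties would be passed to
        // new org.apache.kafka.clients.producer.KafkaProducer<String, String>(props),
        // and records would be sent in a loop while the brokers are killed and restarted.
    }
}
```

      Keeping the configuration in a plain Properties object means the sketch compiles without the Kafka dependency; the producer loop itself is left out.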

      Observed result:

      The Java client cannot reach any of the brokers even though all of them have recovered. It logs messages like "Connection to node 30 (/10.6.0.101:9092) could not be established. Broker may not be available.".
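
      One way to check whether the address in such a warning is still live is a plain TCP connect probe run from the client's network. The sketch below is independent of the Kafka client; the class and method names are hypothetical, and the default address is the one from the log message above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical diagnostic, not part of the Kafka client. */
class BrokerProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unreachable host all count as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the address from the log message; override with <host> <port>.
        String host = args.length > 0 ? args[0] : "10.6.0.101";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9092;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

      Running the same probe against the load balancer address would show whether the bootstrap path is reachable while the address the client keeps retrying is not.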

      Note: this is not a duplicate of https://issues.apache.org/jira/browse/KAFKA-7890. I'm using a client version that already contains the fix for that issue.

      Versions:

      Kafka: version 2.1.0, using the confluentinc/cp-kafka/5.1.0 Docker image

      Client: built from trunk a few days ago (git SHA 9f7e6b291309286e3e3c1610e98d978773c9d504) to pull in the fix for KAFKA-7890

       

            People

              Assignee: Unassigned
              Reporter: Brian (BrianAttwell)
              Votes: 1
              Watchers: 4
