Kafka / KAFKA-7931

Java Client: if all ephemeral brokers fail, client can never reconnect to brokers

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: clients
    • Labels:
      None

      Description

      Steps to reproduce:

      • Set up a Kafka cluster in GKE, with the bootstrap server address configured to point to a load balancer that exposes all GKE nodes
      • Run a producer that emits values into a partition with 3 replicas
      • Kill every broker in the cluster
      • Wait for brokers to restart
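
For readers trying to reproduce this, the producer side of the steps above might be configured roughly as follows. This is only a sketch: the class name `ReproProducerConfig` and the address `kafka-lb.example.internal:9092` are hypothetical, and the String serializers are an assumption, since the report does not say what the producer used. The point it illustrates is that the only address handed to the client is the load balancer, never an individual broker.

```java
import java.util.Properties;

// Hypothetical helper for illustration; not part of the Kafka client.
final class ReproProducerConfig {

    // Builds producer properties whose only bootstrap address is the load balancer.
    static Properties build(String loadBalancerAddress) {
        Properties props = new Properties();
        // The client is told about the load balancer only; broker addresses
        // (such as /10.6.0.101:9092 in the log message) are learned later.
        props.setProperty("bootstrap.servers", loadBalancerAddress);
        // Assumption: String keys and values; the report does not specify serializers.
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address; substitute the real load balancer host and port.
        Properties props = build("kafka-lb.example.internal:9092");
        props.stringPropertyNames().stream()
                .sorted()
                .forEach(key -> System.out.println(key + "=" + props.getProperty(key)));
    }
}
```

With the kafka-clients library on the classpath, these properties could be passed to `new KafkaProducer<String, String>(props)` before running the kill-and-restart steps.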

      Observed result:

      The Java client cannot find any of the nodes, even though they have all recovered. I see messages like "Connection to node 30 (/10.6.0.101:9092) could not be established. Broker may not be available."

      Note, this is not a duplicate of https://issues.apache.org/jira/browse/KAFKA-7890. I'm using the client version that contains the fix for https://issues.apache.org/jira/browse/KAFKA-7890.

      Versions:

      Kafka: Kafka version 2.1.0, using the confluentinc/cp-kafka/5.1.0 Docker image

      Client: trunk from a few days ago (git sha 9f7e6b291309286e3e3c1610e98d978773c9d504), to pull in the fix for KAFKA-7890

       

              People

              • Assignee: Unassigned
              • Reporter: BrianAttwell Brian
              • Votes: 2
              • Watchers: 5
