  1. Kafka
  2. KAFKA-15828

Protect clients from broker hostname reuse



    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: clients, consumer, producer


      In some environments, such as k8s, brokers may be assigned to nodes dynamically from an available pool. During a rolling restart of the cluster, the client may see the same node advertised for different broker IDs within a short period of time. For example, kafka-1 might initially be assigned to node1. Before the client is able to establish a connection, kafka-3 may have taken over node1 instead. Currently there is no protection in the client or in the protocol for this scenario. If the connection succeeds, the client will assume it has a good connection to kafka-1. Until something disrupts the connection, it will continue under this assumption, even if the hostname advertised for kafka-1 changes.

      We have observed this scenario in practice. The client connected to the wrong broker through stale hostname information. It was unable to produce data because of persistent NOT_LEADER errors. The only way to recover in the end was by restarting the client to force a reconnection.

      We have discussed a couple of potential solutions to this problem:

      1. Make the client smarter about managing the connection/hostname mapping. When it detects that the hostname for a broker has changed, it should force a disconnect to ensure it reconnects to the right node.
      2. We can modify the protocol to verify that the client has connected to the intended broker. For example, we can add a field to ApiVersions to indicate the intended broker ID. The broker receiving the request can return an error if its ID does not match that in the request.
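
      To make the two options concrete, here is a minimal sketch in Python. It is not Kafka client or broker code: ConnectionTracker, check_intended_broker, the BROKER_ID_MISMATCH error name, and the "intended broker ID" field are all hypothetical names used for illustration only. It assumes the client can compare the endpoint it dialed for a broker ID against the endpoint advertised in the latest metadata.

```python
from typing import Dict, List, Optional, Tuple

Endpoint = Tuple[str, int]  # (host, port)


class ConnectionTracker:
    """Option 1 (hypothetical): remember which endpoint was dialed for each broker ID."""

    def __init__(self) -> None:
        self._dialed: Dict[int, Endpoint] = {}

    def connected(self, node_id: int, endpoint: Endpoint) -> None:
        # Record the endpoint used when the connection for node_id was opened.
        self._dialed[node_id] = endpoint

    def on_metadata_update(self, advertised: Dict[int, Endpoint]) -> List[int]:
        """Forget and return the broker IDs whose advertised endpoint changed.

        The caller is expected to close those connections and reconnect.
        Brokers missing from the metadata are left alone in this sketch.
        """
        stale = [
            node_id
            for node_id, endpoint in self._dialed.items()
            if node_id in advertised and advertised[node_id] != endpoint
        ]
        for node_id in stale:
            del self._dialed[node_id]
        return sorted(stale)


def check_intended_broker(local_broker_id: int,
                          intended_broker_id: Optional[int]) -> str:
    """Option 2 (hypothetical): broker-side check of an 'intended broker ID' field.

    None models a client that does not send the field; it is accepted
    so that older clients keep working.
    """
    if intended_broker_id is not None and intended_broker_id != local_broker_id:
        return "BROKER_ID_MISMATCH"  # hypothetical error name
    return "NONE"


# Scenario from the description: kafka-1 was dialed at node1, then moved away,
# and kafka-3 is now advertised on node1.
tracker = ConnectionTracker()
tracker.connected(1, ("node1", 9092))
tracker.connected(2, ("node2", 9092))
stale = tracker.on_metadata_update(
    {1: ("node4", 9092), 2: ("node2", 9092), 3: ("node1", 9092)}
)
```

      In this scenario, option 1 flags only broker 1 for a forced disconnect, and under option 2 the broker now on node1 (ID 3) would reject a request intended for ID 1. Option 1 only helps once the client has seen updated metadata, whereas option 2 catches the mismatch at connection time, so the two could be complementary.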

      Are there alternatives? 






            Assignee: Unassigned
            Reporter: Jason Gustafson (hachikuji)