Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
https://issues.apache.org/jira/browse/KAFKA-12455 documented how a Java client can temporarily lose connectivity to a 2-broker cluster that is undergoing a roll: the client repeatedly retries the last alive broker it knows about from the cluster metadata, even when that broker is unavailable. In this case the client could fall back to its bootstrap brokers and reconnect to the cluster more quickly.
For example, assume a 2-broker cluster has broker IDs 1 and 2, and both appear in the bootstrap servers for a consumer. Assume broker 1 rolls such that the Java consumer receives a new METADATA response and only knows about broker 2 being alive, and then broker 2 rolls before the consumer gets a new METADATA response indicating that broker 1 is alive again. At this point the Java consumer will keep retrying broker 2, and it will not reconnect to the cluster until broker 2 becomes available again or the client itself is restarted so that it can use its bootstrap servers again. Another possibility is to fall back to the full bootstrap servers list when the last alive broker becomes unavailable.
I believe librdkafka-based clients may perform this fallback, though I am not certain. We should consider it for Java clients.
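To make the proposed behavior concrete, here is a minimal, self-contained sketch of the fallback decision. It is not the Kafka client's actual code: the class `BootstrapFallbackSketch`, the method `candidates`, and the broker addresses are hypothetical names chosen for illustration, and the sketch assumes the client can tell which known brokers are currently unreachable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not part of the Kafka client API.
final class BootstrapFallbackSketch {

    /**
     * Picks the brokers to try next.
     *
     * @param bootstrap   the configured bootstrap servers
     * @param knownAlive  brokers listed in the most recent METADATA response
     * @param unreachable brokers that the client currently cannot connect to
     */
    static List<String> candidates(List<String> bootstrap,
                                   List<String> knownAlive,
                                   Set<String> unreachable) {
        List<String> reachableKnown = new ArrayList<>();
        for (String broker : knownAlive) {
            if (!unreachable.contains(broker)) {
                reachableKnown.add(broker);
            }
        }
        if (!reachableKnown.isEmpty()) {
            // Normal case: keep using brokers from the cluster metadata.
            return reachableKnown;
        }
        // Every broker from the metadata is unreachable:
        // fall back to the full bootstrap list instead of retrying forever.
        return new ArrayList<>(bootstrap);
    }

    public static void main(String[] args) {
        List<String> bootstrap = List.of("broker1:9092", "broker2:9092");
        // Scenario from the description: metadata only lists broker 2,
        // and broker 2 has just gone down for its own roll.
        List<String> next = candidates(bootstrap,
                                       List.of("broker2:9092"),
                                       Set.of("broker2:9092"));
        System.out.println(next); // prints [broker1:9092, broker2:9092]
    }
}
```

In the scenario described above, the fallback lets the consumer try broker 1 again as soon as broker 2 becomes unreachable, rather than waiting for broker 2 to return or for a restart.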
Issue Links
- duplicates
  - KAFKA-13653 Proactively discover alive brokers from bootstrap server lists when all nodes are down (Resolved)
- is related to
  - KAFKA-7931 Java Client: if all ephemeral brokers fail, client can never reconnect to brokers (Open)
  - KAFKA-8206 A consumer can't discover new group coordinator when the cluster was partly restarted (In Progress)
  - KAFKA-3068 NetworkClient may connect to a different Kafka cluster than originally configured (Resolved)
  - KAFKA-14548 Stable streams applications stall due to infrequent restoreConsumer polls (Resolved)