Details
- Bug
- Status: Open
- Major
- Resolution: Unresolved
- 1.0.0
- None
- None
Description
When a consumer request times out, i.e. takes longer than request.timeout.ms, and the client disconnects from the coordinator, the coordinator may leak file descriptors. The following code produces this behavior:
Properties config = new Properties();
config.put("bootstrap.servers", BROKERS);
config.put("group.id", "leak-test");
config.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
config.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
config.put("max.poll.interval.ms", Integer.MAX_VALUE);
config.put("request.timeout.ms", 12000);

KafkaConsumer<String, String> consumer1 = new KafkaConsumer<>(config);
KafkaConsumer<String, String> consumer2 = new KafkaConsumer<>(config);

List<String> topics = Collections.singletonList("leak-test");
consumer1.subscribe(topics);
consumer2.subscribe(topics);

consumer1.poll(100);
consumer2.poll(100);
When the above executes, consumer2 will attempt to rebalance indefinitely, blocked by the inactive consumer1. Every 12 seconds it gives up on the JOIN_GROUP request, disconnects, and logs a "Marking the coordinator dead" message. Unless the consumer exits or times out, this leaks sockets in CLOSE_WAIT on the coordinator, and the broker will eventually run out of file descriptors and crash.
Aside from faulty code as in the example above, or an intentional DoS, any client bug causing a consumer to block, e.g. KAFKA-6397, could also result in this leak.
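For readers unfamiliar with CLOSE_WAIT, the following is a minimal sketch of the general mechanism. It does not use Kafka and is not the broker's code. It assumes that each client disconnect leaves one server-side socket that is never closed; the class name CloseWaitDemo and the method leakConnections are hypothetical names chosen for illustration. A plain loopback ServerSocket stands in for the coordinator, and each loop iteration stands in for one timed-out request followed by a disconnect.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

public class CloseWaitDemo {

    // Hypothetical helper: simulates `cycles` client disconnects against a
    // server that never closes its side of the connection.
    static List<Socket> leakConnections(int cycles) throws IOException {
        List<Socket> leaked = new ArrayList<>();
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            for (int i = 0; i < cycles; i++) {
                // The "client" connects and then gives up, as after a request timeout.
                Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                Socket accepted = server.accept();
                accepted.setSoTimeout(5000); // avoid hanging if the FIN never arrives
                client.close();

                // The server observes end-of-stream: the peer has closed its side.
                int eof = accepted.getInputStream().read();
                if (eof != -1) {
                    throw new IllegalStateException("expected end-of-stream, got " + eof);
                }

                // The accepted socket is deliberately not closed here. Until it is,
                // the OS keeps the connection (in CLOSE_WAIT) and its file descriptor.
                leaked.add(accepted);
            }
        }
        return leaked;
    }

    public static void main(String[] args) throws IOException {
        List<Socket> leaked = leakConnections(3);
        long open = leaked.stream().filter(s -> !s.isClosed()).count();
        System.out.println(open + " server-side sockets still open after their peers disconnected");

        // Closing the server side is what releases the descriptors.
        for (Socket s : leaked) {
            s.close();
        }
    }
}
```

While the process holds the sockets in the returned list, a tool such as netstat should show them in CLOSE_WAIT. In this sketch the descriptors are released only when the server side calls close(), which is the step the report suggests is missing on the coordinator after a client disconnects.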
Issue Links
- is related to KAFKA-5586 Handle client disconnects during JoinGroup (Open)