Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Duplicate
-
0.8.2.0
-
None
-
None
Description
I'm seeing issues when sending a message with the new producer client API. The future returned from Producer.send() will block indefinitely if the cluster is unreachable for some reason. Here are the steps:
- Start up a single node kafka cluster locally.
- Start up application and create a KafkaProducer with the following config:
KafkaProducerWrapper values: compression.type = snappy metric.reporters = [] metadata.max.age.ms = 300000 metadata.fetch.timeout.ms = 60000 acks = all batch.size = 16384 reconnect.backoff.ms = 10 bootstrap.servers = [localhost:9092] receive.buffer.bytes = 32768 retry.backoff.ms = 100 buffer.memory = 33554432 timeout.ms = 30000 key.serializer = class com.mycompany.kafka.serializer.ToStringEncoder retries = 3 max.request.size = 1048576 block.on.buffer.full = true value.serializer = class com.mycompany.kafka.serializer.JsonEncoder metrics.sample.window.ms = 30000 send.buffer.bytes = 131072 max.in.flight.requests.per.connection = 5 metrics.num.samples = 2 linger.ms = 0 client.id = site-json
- Send some messages...they are successfully sent
- Shut down the kafka broker
- Send another message.
At this point, calling get() on the returned Future will block indefinitely until the broker is restarted.
It appears that there is some logic in org.apache.kafka.clients.producer.internal.Sender that is supposed to mark the Future as "done" in response to a disconnect event (towards the end of the run(long) method). However, the while loop earlier in this method seems to remove the broker from consideration entirely, so the final loop over ClientResponse objects is never executed.
It seems like "timeout.ms" configuration should be honored in this case, or perhaps introduce another timeout, indicating that we should give up waiting for the cluster to return.
Attachments
Issue Links
- duplicates
-
KAFKA-1788 producer record can stay in RecordAccumulator forever if leader is no available
- Resolved