[KAFKA-1286] Retry Can Block - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: None
Component/s: producer
Labels: None

Description

Under the following scenario, the retry logic can block:

1. The last broker's socket is closed, which triggers sender.handleDisconnect() and marks the node as disconnected.

2. In the next sender.run(), because the node is disconnected, the partition is removed from the ready set and sender.initConnection() is called, which does not throw an exception.

3. So in this round of sends, the only request the sender tries to send is the metadata request, addressed to the last broker, and the sender first tries to connect to that broker.

4. In selector.poll(), the finishConnect() call throws an exception, and in handleDisconnects() the in-flight request's batches are null because it is a metadata request.

5. We are now back at step 1 and loop forever. Note that this infinite loop can be triggered even without calling producer.close.
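
The cycle in steps 1 to 5 can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual Sender code: the `RetryLoopSketch` class, the `Batch` type, and `roundsWithoutProgress` are hypothetical names, and the real loop is unbounded, whereas this one stops after `maxRounds` so that it terminates.

```java
// Toy model of the loop described in steps 1-5. All names here are
// hypothetical; this does not reproduce the real producer classes.
class RetryLoopSketch {
    enum NodeState { CONNECTING, DISCONNECTED }

    // Hypothetical stand-in for a produce batch waiting to be sent.
    static final class Batch {
        int attempts = 0;     // send attempts charged to this batch
        boolean done = false; // set once the batch is sent or failed
    }

    // Runs the toy loop for at most maxRounds and returns how many rounds
    // completed while the batch was neither sent nor failed.
    static int roundsWithoutProgress(Batch batch, int maxRounds) {
        NodeState node = NodeState.DISCONNECTED; // step 1: socket closed
        int rounds = 0;
        while (rounds < maxRounds && !batch.done) {
            // Step 2: the node is disconnected, so the partition is not
            // ready; a connection attempt starts without an exception.
            boolean partitionReady = (node != NodeState.DISCONNECTED);
            node = NodeState.CONNECTING;

            // Step 3: only a metadata request is in flight, and it
            // carries no produce batches.
            Batch inFlightBatch = partitionReady ? batch : null;

            // Step 4: the connection attempt fails during poll. The
            // disconnect handler finds no batches on the in-flight
            // request, so no retry is counted against the batch.
            node = NodeState.DISCONNECTED;
            if (inFlightBatch != null) {
                inFlightBatch.attempts++;
            }

            // Step 5: the state is identical to step 1 again.
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        Batch batch = new Batch();
        int rounds = roundsWithoutProgress(batch, 1000);
        System.out.println("rounds=" + rounds + ", attempts=" + batch.attempts);
    }
}
```

In this model the batch is never charged a retry and never completes, so nothing in the loop can bring it to an end, which is the blocking behaviour the steps describe.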

Also, we need to introduce a retry backoff config; otherwise the retries will be exhausted too quickly (in my tests, 10 retries were exhausted in about 600 ms).
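
A minimal sketch of what a retry backoff could look like, assuming a per-batch timestamp of the last attempt. The class, the `PendingBatch` type, both method names, and the 100 ms backoff value are hypothetical and are not taken from the patch. The 60 ms per-attempt figure is only the rough average implied by the 600 ms for 10 retries reported above.

```java
// Hypothetical illustration of a retry backoff check; not the patch itself.
class RetryBackoffSketch {
    // Hypothetical record of a batch that may need to be retried.
    static final class PendingBatch {
        final int attempts;       // send attempts made so far
        final long lastAttemptMs; // time of the most recent attempt

        PendingBatch(int attempts, long lastAttemptMs) {
            this.attempts = attempts;
            this.lastAttemptMs = lastAttemptMs;
        }
    }

    // A batch that has never been tried is ready immediately; a batch
    // being retried must wait until the backoff has elapsed.
    static boolean readyToRetry(PendingBatch batch, long nowMs, long retryBackoffMs) {
        if (batch.attempts == 0) {
            return true;
        }
        return nowMs - batch.lastAttemptMs >= retryBackoffMs;
    }

    // Lower bound on the time needed to use up maxRetries when each
    // attempt takes attemptMs and is followed by a wait of retryBackoffMs.
    static long minTimeToExhaustMs(int maxRetries, long attemptMs, long retryBackoffMs) {
        return maxRetries * (attemptMs + retryBackoffMs);
    }

    public static void main(String[] args) {
        // No backoff: 10 retries at roughly 60 ms each.
        System.out.println(minTimeToExhaustMs(10, 60, 0));   // prints 600
        // Hypothetical 100 ms backoff between attempts.
        System.out.println(minTimeToExhaustMs(10, 60, 100)); // prints 1600
    }
}
```

With no backoff the retry budget lasts only as long as the failed attempts themselves take; a backoff stretches it so that a briefly unavailable broker has time to come back.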

Attachments

- KAFKA-1286_2014-03-04_17:56:47.patch (05/Mar/14 01:57, 31 kB, Guozhang Wang)
- KAFKA-1286_2014-03-04_15:14:49.patch (04/Mar/14 23:15, 42 kB, Guozhang Wang)
- KAFKA-1286_2014-03-04_11:04:32.patch (04/Mar/14 19:04, 21 kB, Guozhang Wang)
- KAFKA-1286.patch (04/Mar/14 18:50, 17 kB, Guozhang Wang)

People

Assignee: Unassigned

Reporter: Guozhang Wang

Votes: 0

Watchers: 4

Dates

Created: 01/Mar/14 01:48

Updated: 05/Mar/14 22:54

Resolved: 05/Mar/14 22:40