[KAFKA-7941] Connect KafkaBasedLog work thread terminates when getting offsets fails because broker is unavailable - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 2.0.0
Fix Version/s: 2.0.2, 2.1.2, 2.2.2, 2.4.0, 2.3.1
Component/s: connect
Labels: None

Description

My team has run into this Connect bug regularly in the last six months while doing infrastructure maintenance that causes intermittent broker availability issues. I'm a little surprised it exists given how routinely it affects us, so perhaps someone in the know can point out if our setup is somehow just incorrect. My team is running 2.0.0 on both the broker and client, though from what I can tell from reading the code, the issue continues to exist through 2.2; at least, I was able to write a failing unit test that I believe reproduces it.

When a KafkaBasedLog worker thread in the Connect runtime calls readToLogEnd while brokers are unavailable, the TimeoutException thrown by the consumer's endOffsets call goes uncaught all the way up to the top-level catch (Throwable t), which kills the thread until Connect is restarted. The result is that Connect stops functioning entirely, with no indication other than the single ERROR log line written by that catch block; tasks still show as running.
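The failure mode described above can be modeled in a few lines. This is a simplified sketch, not the Connect source: `WorkThread`, `readToEnd`, `maxIterations`, and `BrokerTimeoutException` (a stand-in for Kafka's `TimeoutException`) are all hypothetical names, and the loop is bounded only so the demo terminates.

```java
import java.util.ArrayList;
import java.util.List;

public class WorkThreadFailureSketch {

    /** Stand-in for org.apache.kafka.common.errors.TimeoutException (assumption). */
    static class BrokerTimeoutException extends RuntimeException {
        BrokerTimeoutException(String message) {
            super(message);
        }
    }

    /** Hypothetical, heavily simplified model of a log-reading work thread. */
    static class WorkThread extends Thread {
        private final Runnable readToEnd;  // stand-in for the real read to the end of the log
        private final int maxIterations;   // bounds the demo; a real loop would run until stopped
        final List<String> errorLog = new ArrayList<>();
        int completedReads = 0;

        WorkThread(Runnable readToEnd, int maxIterations) {
            this.readToEnd = readToEnd;
            this.maxIterations = maxIterations;
        }

        @Override
        public void run() {
            try {
                while (completedReads < maxIterations) {
                    readToEnd.run();       // a timeout thrown here is not caught inside the loop
                    completedReads++;
                }
            } catch (Throwable t) {
                // The only trace of the failure: one log entry, then run() returns for good.
                errorLog.add("Unexpected exception in work thread: " + t.getMessage());
            }
        }
    }

    /** Runs the model with a read that times out on its second call. */
    static WorkThread runDemo() {
        int[] calls = {0};
        WorkThread thread = new WorkThread(() -> {
            if (++calls[0] == 2) {
                throw new BrokerTimeoutException("simulated broker timeout");
            }
        }, 5);
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return thread;
    }

    public static void main(String[] args) {
        WorkThread thread = runDemo();
        System.out.println("alive=" + thread.isAlive()
                + " completedReads=" + thread.completedReads
                + " errorLog=" + thread.errorLog);
    }
}
```

In this model the thread exits after a single timeout even though four more reads were scheduled, and nothing outside the thread is notified, which mirrors the "tasks still show as running" symptom.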

The proposed fix is to simply catch and log the TimeoutException, allowing the worker thread to retry forever.
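A minimal sketch of the catch-and-retry idea follows. It is not the patch from the linked pull request: `readEndOffsetWithRetry`, `fetchEndOffset`, `backoffMs`, and `BrokerTimeoutException` (a stand-in for Kafka's `TimeoutException`) are hypothetical names, and the pause between attempts is an added assumption rather than something the report asks for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

public class RetryOnTimeoutSketch {

    /** Stand-in for org.apache.kafka.common.errors.TimeoutException (assumption). */
    static class BrokerTimeoutException extends RuntimeException {
        BrokerTimeoutException(String message) {
            super(message);
        }
    }

    /**
     * Calls fetchEndOffset until it succeeds. Each timeout is recorded in the
     * given log and followed by a pause of backoffMs before the next attempt.
     * Any other exception still propagates to the caller.
     */
    static long readEndOffsetWithRetry(LongSupplier fetchEndOffset, List<String> log, long backoffMs) {
        while (true) {
            try {
                return fetchEndOffset.getAsLong();
            } catch (BrokerTimeoutException e) {
                log.add("Timeout while reading log to end, retrying: " + e.getMessage());
                try {
                    Thread.sleep(backoffMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry", ie);
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        List<String> log = new ArrayList<>();
        // Simulated broker outage: the first two fetches time out, the third succeeds.
        long endOffset = readEndOffsetWithRetry(() -> {
            if (calls[0]++ < 2) {
                throw new BrokerTimeoutException("simulated broker timeout");
            }
            return 42L;
        }, log, 10L);
        System.out.println("endOffset=" + endOffset + " after " + log.size() + " retries");
    }
}
```

Catching only the timeout keeps the retry narrow: unexpected errors still surface, while a temporary broker outage no longer ends the loop.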

Alternatively, perhaps Connect is not expected to recover from broker unavailability, though that would be disappointing. I would at least hope for a louder failure than the single ERROR log.

Issue Links

is related to

KAFKA-8790 [kafka-connect] KafkaBaseLog.WorkThread not recoverable

Open

relates to

KAFKA-8485 Kafka connect worker does not respond/function when kafka broker goes down.

Open

KAFKA-6608 Add TimeoutException to KafkaConsumer#position()

Resolved

links to

GitHub Pull Request #6283

People

Assignee: Paul Whalen

Reporter: Paul Whalen

Reviewer: Randall Hauch

Votes: 1

Watchers: 5

Dates

Created: 18/Feb/19 01:00

Updated: 11/Sep/19 14:46

Resolved: 13/Aug/19 22:25