[KAFKA-12726] misbehaving Task.stop() can prevent other Tasks from stopping - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 2.8.0
Fix Version/s: None
Component/s: connect
Labels: None

Description

We've observed a misbehaving Task fail to stop in a timely manner (e.g. when stuck in a retry loop). Although Connect supports the property task.shutdown.graceful.timeout.ms, the timeout is currently not enforced – a task can take as long as it wants to stop, and the only consequence is an error message.

We've seen a Worker's "task-count" metric double following a rebalance, which we think is due to Tasks not getting cleaned up when Task.stop() is stuck.

While the Connector implementation is ultimately to blame here – a Task probably shouldn't loop forever in stop() – we believe the Connect runtime should handle this situation more gracefully.
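One way the runtime could bound a stuck stop() is sketched below. This is only an illustration of the idea, not the change made in the linked pull request: StoppableTask is a hypothetical stand-in for a Connect task, and stopWithTimeout and its timeoutMs parameter are assumed names, with timeoutMs playing the role of task.shutdown.graceful.timeout.ms. The sketch runs stop() on a separate daemon thread, waits up to the timeout, and then interrupts and abandons the call so the caller can move on to other tasks.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedStop {

    /** Hypothetical stand-in for a Connect task; only stop() matters here. */
    interface StoppableTask {
        void stop();
    }

    /**
     * Runs task.stop() on a separate daemon thread and waits at most timeoutMs.
     * Returns true if stop() completed in time, false if it timed out or threw.
     */
    static boolean stopWithTimeout(StoppableTask task, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "task-stop");
            t.setDaemon(true); // a stuck stop() must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(task::stop);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the stuck thread, then give up on it
            return false;
        } catch (ExecutionException e) {
            return false; // stop() itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return false;
        } finally {
            executor.shutdownNow(); // never blocks; only requests termination
        }
    }

    public static void main(String[] args) {
        StoppableTask wellBehaved = () -> { };
        // Simulates a task stuck in a retry loop that swallows interrupts.
        StoppableTask stuck = () -> {
            while (true) {
                try {
                    Thread.sleep(1_000);
                } catch (InterruptedException ignored) {
                    // keeps retrying
                }
            }
        };
        System.out.println("well-behaved stopped in time: " + stopWithTimeout(wellBehaved, 500));
        System.out.println("stuck stopped in time: " + stopWithTimeout(stuck, 200));
    }
}
```

Because the caller never waits longer than the timeout, one stuck task cannot delay the shutdown of the others. The stuck thread itself may keep running if it ignores interrupts, so a real runtime would still need to log the failure and account for the leaked thread.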

Issue Links

duplicates: KAFKA-10792 Source tasks can block herder thread by hanging during stop (Resolved)

links to: GitHub Pull Request #10605

People

Assignee: Ryanne Dolan

Reporter: Ryanne Dolan

Votes: 0

Watchers: 2

Dates

Created: 28/Apr/21 16:09

Updated: 03/May/21 19:13

Resolved: 30/Apr/21 20:10