Details
- Bug
- Status: Resolved
- Minor
- Resolution: Fixed
- 2.8.0
Description
We've observed a misbehaving Task fail to stop in a timely manner (e.g. stuck in a retry loop). Despite Connect supporting a property task.shutdown.graceful.timeout.ms, this is currently not enforced – tasks can take as long as they want to stop, and the only consequence is an error message.
We've seen a Worker's "task-count" metric double following a rebalance, which we think is due to Tasks not getting cleaned up when Task.stop() is stuck.
While the Connector implementation is ultimately to blame here – a Task probably shouldn't loop forever in stop() – we believe the Connect runtime should handle this situation more gracefully.
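One way the runtime could bound a stuck stop() is sketched below. This is a minimal illustration only, not the change that resolved this issue: BoundedStop, StoppableTask and stopWithTimeout are hypothetical names, and the timeout argument merely stands in for the value of task.shutdown.graceful.timeout.ms. The idea is to run stop() on a helper thread, wait for at most the timeout, and then abandon the call so the caller can carry on with cleanup.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedStop {

    /** Hypothetical stand-in for a Connect task; only stop() matters here. */
    public interface StoppableTask {
        void stop();
    }

    /**
     * Runs task.stop() on a helper thread and waits at most timeoutMs.
     * Returns true if stop() returned (or threw) in time, false if it was abandoned.
     */
    public static boolean stopWithTimeout(StoppableTask task, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "task-stop");
            t.setDaemon(true); // a stuck stop() must not keep the JVM alive
            return t;
        });
        Runnable stopCall = task::stop;
        Future<?> future = executor.submit(stopCall);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Interrupt the helper thread; a task that ignores interrupts keeps
            // running in the background, but the caller is no longer blocked.
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // stop() threw, so it is no longer running; treat the task as stopped.
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A well-behaved task whose stop() returns immediately.
        boolean quick = stopWithTimeout(() -> { }, 1000);
        // A misbehaving task that loops forever and swallows interrupts.
        boolean stuck = stopWithTimeout(() -> {
            while (true) {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException ignored) {
                    // deliberately ignored to simulate a stuck retry loop
                }
            }
        }, 200);
        System.out.println("quick=" + quick + " stuck=" + stuck);
    }
}
```

When stopWithTimeout returns false, the caller would still need to drop the task from its own bookkeeping (for example, whatever feeds the "task-count" metric), since the abandoned thread may never finish.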
Issue Links
- duplicates KAFKA-10792 Source tasks can block herder thread by hanging during stop (Resolved)