[KAFKA-4748] Need a way to shutdown all workers in a Streams application at the same time - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.10.1.1
Fix Version/s: 2.8.0
Component/s: streams
Labels: None

Description

If you have a fleet of Streams workers for an application and attempt to shut them down simultaneously (e.g. via SIGTERM, with a hook registered through Runtime.getRuntime().addShutdownHook() that calls streams.close()), a large number of the workers fail to shut down.
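For reference, the shutdown path described above might look roughly like the following. This is a minimal sketch rather than the reporter's actual code: `StreamsApp` is a hypothetical stand-in for the real `KafkaStreams` object so the example runs without Kafka on the classpath, and the class name, the 30-second timeout, and the latch are illustrative assumptions.

```java
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

class ShutdownHookSketch {

    /** Hypothetical stand-in for the part of a Streams client used here (assumption). */
    interface StreamsApp {
        boolean close(Duration timeout);
    }

    /**
     * Registers a JVM shutdown hook that closes the application and then
     * releases the latch, so a waiting main thread can exit after close returns.
     */
    static Thread registerShutdownHook(StreamsApp streams, Duration timeout, CountDownLatch stopped) {
        Thread hook = new Thread(() -> {
            try {
                streams.close(timeout);
            } finally {
                // Always release the latch, even if close throws.
                stopped.countDown();
            }
        }, "streams-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        AtomicBoolean closed = new AtomicBoolean(false);
        CountDownLatch stopped = new CountDownLatch(1);

        // Fake application: records that close was requested.
        StreamsApp fake = timeout -> {
            closed.set(true);
            System.out.println("streams closed");
            return true;
        };

        registerShutdownHook(fake, Duration.ofSeconds(30), stopped);
        System.out.println("worker running; hook fires on SIGTERM or normal JVM exit");
    }
}
```

Every worker in the fleet runs a hook like this independently, so nothing coordinates the order in which the close calls take effect across the group.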

The problem appears to be a race condition between the shutdown signal and the consumer rebalance that is triggered when some workers exit before others. Workers that receive the signal later apparently fail to exit because they are caught in the rebalance.

Terminating workers in a rolling fashion is not advisable in some situations. A rolling shutdown results in many unnecessary rebalances and may fail, because the application may have a large amount of local state that a smaller number of nodes cannot store.

It would appear that a protocol change is needed so that the coordinator can signal a consumer group to shut down without triggering a rebalance.

Issue Links

relates to

KAFKA-6943 Have option to shutdown KS cleanly if any threads crashes, or if all threads crash (Closed)

KAFKA-10015 React Smartly to Unexpected Errors on Stream Threads (Closed)

People

Assignee: Walker Carlson

Reporter: Elias Levy

Votes: 1

Watchers: 9

Dates

Created: 08/Feb/17 23:38

Updated: 24/Feb/21 23:20

Resolved: 18/Nov/20 23:50