[KAFKA-10015] React Smartly to Unexpected Errors on Stream Threads - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 2.5.0
Fix Version/s: None
Component/s: streams
Labels:
- kip

Description

Currently, if an unexpected error occurs on a stream thread, the stream thread dies, a rebalance is triggered, and the Streams client continues to run with fewer stream threads.

Some errors trigger a cascade of stream thread deaths: after the rebalance caused by the death of the first thread, the next thread dies, another rebalance is triggered, the next thread dies, and so forth until all stream threads are dead and the instance shuts down. Such a chain of rebalances could be avoided if the error were recognized as one that causes cascading stream thread deaths, so that the Streams client could be shut down after the first stream thread death.

On the other hand, some unexpected errors are transient, and the stream thread could safely be restarted without causing further errors and without the need to restart the Streams client.

The goal of this ticket is to classify errors and to react to them automatically in a way that avoids cascading deaths and recovers stream threads where possible.
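As a rough illustration of the classification idea, the sketch below maps an error to one of two reactions: replace the dead thread if the error looks transient, otherwise shut down the client to avoid a cascade. All names here (`ErrorClassifierSketch`, `Response`, `classify`) are hypothetical and are not part of the Kafka Streams API, and treating `TimeoutException` and `IOException` as transient is only an assumption made for the example.

```java
import java.io.IOException;
import java.util.concurrent.TimeoutException;

public class ErrorClassifierSketch {

    // Hypothetical reactions to an unexpected error on a stream thread.
    enum Response { REPLACE_THREAD, SHUTDOWN_CLIENT }

    // Walks the cause chain; if any cause is of a type assumed to be
    // transient, the thread may be replaced. Anything else is assumed to
    // recur on the next thread, so the whole client is shut down.
    static Response classify(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof TimeoutException || t instanceof IOException) {
                return Response.REPLACE_THREAD;
            }
        }
        return Response.SHUTDOWN_CLIENT;
    }

    public static void main(String[] args) {
        // A timeout wrapped in a runtime exception is treated as transient.
        System.out.println(classify(new RuntimeException(new TimeoutException("broker slow"))));
        // An unrecognized error leads to a client shutdown.
        System.out.println(classify(new IllegalStateException("corrupt state")));
    }
}
```

Defaulting to a shutdown for unrecognized errors is a deliberate choice in this sketch: it trades availability for avoiding the chain of rebalances described above.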

KIP-663: https://cwiki.apache.org/confluence/display/KAFKA/KIP-663%3A+API+to+Start+and+Shut+Down+Stream+Threads

Issue Links

is related to
- KAFKA-4748 Need a way to shutdown all workers in a Streams application at the same time (Closed)
- KAFKA-10500 Add API to Start and Stop Stream Threads (Closed)
- KAFKA-6943 Have option to shutdown KS cleanly if any threads crashes, or if all threads crash (Closed)

People

Assignee: Walker Carlson
Reporter: Bruno Cadonna
Votes: 0
Watchers: 3

Dates

Created: 18/May/20 15:25
Updated: 29/Jan/21 20:53
Resolved: 29/Jan/21 20:52