[KAFKA-12523] Need to improve handling of TimeoutException when committing offsets

Details

Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: 2.8.0
Fix Version/s: 2.8.0
Component/s: streams
Labels:
None

Description

Right now, in TaskManager#commitOffsetsOrTransaction, if we catch a TimeoutException then under ALOS we just rethrow it, while under EOS we rethrow it as a TaskCorruptedException. The problem is that commitOffsetsOrTransaction can be invoked from several places:

Commit within the StreamThread main processing loop (either user requested or because the commit interval has elapsed): this is presumably the case we had in mind when deciding how to handle the TimeoutException in commitOffsetsOrTransaction, so no problem here
Clean shutdown of application: a bit weird to throw a TaskCorruptedException in this case, but it’ll just end up being caught and forcing a closeDirty, so again no problem here
From TaskManager#handleRevocation: in this case, it’s possible we hit a TimeoutException on a task that’s actually being revoked. This exception will be saved and rethrown from poll, so under EOS we would catch a TaskCorruptedException and then try to revive a task that we no longer own. Pretty sure this will cause an NPE in the TaskManager. Under ALOS, the rethrown TimeoutException will also be bubbled up through poll, but unlike TaskCorruptedException we don’t catch TimeoutException anywhere in the StreamThread loop, so this will trigger the uncaught exception handler
From TaskManager#handleTaskCorrupted: this method is itself invoked from within the catch TaskCorruptedException block of the StreamThread’s runLoop. If we throw TaskCorruptedException again, then I believe we won’t even catch it in the safety net catch Throwable block of the runLoop; it’ll just be thrown directly up through run().
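
To make the last case concrete, here is a minimal, self-contained Java sketch. It is not the actual Streams code: the exception classes are hypothetical stand-ins for the Kafka types, and the commit method and loop are heavily simplified models of the behavior described above. It illustrates the Java rule at play: an exception thrown from inside a catch block is not handled by sibling catch blocks of the same try, so it skips the catch Throwable safety net.

```java
import java.util.ArrayList;
import java.util.List;

public class CommitTimeoutSketch {
    // Hypothetical stand-ins for the Kafka exception types; NOT the real classes.
    static class TimeoutException extends RuntimeException {
        TimeoutException(String msg) { super(msg); }
    }

    static class TaskCorruptedException extends RuntimeException {
        TaskCorruptedException(Throwable cause) { super(cause); }
    }

    // Simplified model of the described handling: ALOS rethrows the timeout as-is,
    // EOS wraps it in a TaskCorruptedException. The boolean flags are assumptions
    // made purely for illustration.
    static void commitOffsetsOrTransaction(boolean eos, boolean commitTimesOut) {
        try {
            if (commitTimesOut) {
                throw new TimeoutException("commit timed out");
            }
        } catch (TimeoutException e) {
            if (eos) {
                throw new TaskCorruptedException(e);
            }
            throw e;
        }
    }

    // One simplified run-loop iteration under EOS; returns a trace of what happened.
    static List<String> runLoopIterationUnderEos() {
        List<String> trace = new ArrayList<>();
        try {
            try {
                // Commit from the main processing loop times out.
                commitOffsetsOrTransaction(true, true);
            } catch (TaskCorruptedException e) {
                trace.add("caught TaskCorruptedException");
                // Models handleTaskCorrupted committing again and timing out again.
                commitOffsetsOrTransaction(true, true);
            } catch (Throwable t) {
                // Safety net: never reached for exceptions thrown by the catch block above.
                trace.add("safety net caught " + t.getClass().getSimpleName());
            }
        } catch (RuntimeException escaped) {
            // Models the exception propagating up through run().
            trace.add("escaped run loop: " + escaped.getClass().getSimpleName());
        }
        return trace;
    }

    public static void main(String[] args) {
        // prints [caught TaskCorruptedException, escaped run loop: TaskCorruptedException]
        System.out.println(runLoopIterationUnderEos());
    }
}
```

In this sketch the second TaskCorruptedException never reaches the safety net, which matches the concern raised for handleTaskCorrupted. Calling commitOffsetsOrTransaction(false, true) shows the ALOS side: the TimeoutException is rethrown unchanged.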

Issue Links

links to

GitHub Pull Request #10407

People

Assignee: A. Sophie Blee-Goldman

Reporter: A. Sophie Blee-Goldman

Votes: 0

Watchers: 3

Dates

Created: 23/Mar/21 03:41

Updated: 30/Mar/21 17:08

Resolved: 29/Mar/21 21:24