[KAFKA-10664] Streams fails to overwrite corrupted offsets leading to infinite OffsetOutOfRangeException loop - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: 2.7.0
Fix Version/s: 2.7.0
Component/s: streams
Labels: None

Description

In ~~KAFKA-10391~~ we fixed an issue where Streams could get stuck in an infinite loop of OffsetOutOfRangeException/TaskCorruptedException due to re-initializing the corrupted offsets from the checkpoint after each revival. The fix we applied was to remove the corrupted offsets from the state manager and then force it to write a new checkpoint file without those offsets during revival.

Unfortunately we missed that there's an optimization in OffsetCheckpoint#write that just returns without writing anything when there are no offsets. So if every offset a task has is corrupted, nothing is left to write after removing them, and the write is skipped, leaving the corrupted checkpoint file in place to be re-read on the next revival.

Probably we should just fix the optimization in OffsetCheckpoint so that it deletes the current checkpoint file when there are no offsets to write.
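
A minimal sketch of the difference between the two behaviors, for illustration only. This is not the actual OffsetCheckpoint code or the patch in the linked pull request: the class `CheckpointSketch`, its method names, the `String` keys (standing in for topic partitions), and the one-line-per-entry file format are all hypothetical simplifications.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified stand-in for a checkpoint file. Not the Kafka Streams class.
public class CheckpointSketch {
    private final Path file;

    public CheckpointSketch(Path file) {
        this.file = file;
    }

    // Behavior described in this ticket as the problem: an empty map
    // returns early, so any previously written (stale) file survives.
    public void writeSkippingEmpty(Map<String, Long> offsets) throws IOException {
        if (offsets.isEmpty()) {
            return;
        }
        writeFile(offsets);
    }

    // Proposed behavior: an empty map deletes the existing file, so
    // stale offsets cannot be read back later.
    public void writeDeletingEmpty(Map<String, Long> offsets) throws IOException {
        if (offsets.isEmpty()) {
            Files.deleteIfExists(file);
            return;
        }
        writeFile(offsets);
    }

    // Assumed format: one "<partition> <offset>" line per entry, sorted by key.
    private void writeFile(Map<String, Long> offsets) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> entry : new TreeMap<>(offsets).entrySet()) {
            lines.add(entry.getKey() + " " + entry.getValue());
        }
        // Default options create the file or truncate an existing one.
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("checkpoint-sketch");
        CheckpointSketch checkpoint = new CheckpointSketch(dir.resolve(".checkpoint"));
        Map<String, Long> none = Collections.emptyMap();

        // Simulate a checkpoint holding an offset that later turns out to be corrupted.
        checkpoint.writeDeletingEmpty(Collections.singletonMap("changelog-0", 42L));

        checkpoint.writeSkippingEmpty(none);
        System.out.println("stale file kept by early return: " + Files.exists(checkpoint.file));

        checkpoint.writeDeletingEmpty(none);
        System.out.println("stale file kept after delete-on-empty: " + Files.exists(checkpoint.file));
    }
}
```

In this sketch the early-return variant leaves the old file on disk, which mirrors how the corrupted offsets would be re-initialized on the next revival, while the delete-on-empty variant removes it.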

Issue Links

links to

GitHub Pull Request #9534

People

Assignee: A. Sophie Blee-Goldman

Reporter: A. Sophie Blee-Goldman

Votes: 0

Watchers: 2

Dates

Created: 30/Oct/20 01:48

Updated: 03/Nov/20 14:06

Resolved: 30/Oct/20 23:57