KAFKA-10471: TimeIndex handling may cause data loss in certain back-to-back failure scenarios


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0
    • Component/s: core, log
    • Labels: None

      Description

      1. During a clean shutdown, the active segment for log A trims its time index to the last filled entry, and the broker sets the clean shutdown marker.
      2. The broker restarts and loads logs. Because of the clean shutdown marker, no recovery runs; log A is loaded with the previous active segment as the current one, and its TimeIndex is resized back to the maximum size.
      3. Before all logs finish loading, the broker suffers a hard shutdown, leaving the clean shutdown marker in place.
      4. The broker restarts again. Log A skips recovery because the clean shutdown marker is still present, but the TimeIndex code treats the file resized by the previous instance as fully populated (it assumes a file is either newly created or full of valid entries; see the sketch after this list).
      5. The first append to the active segment triggers a roll, and the TimeIndex is rolled with the timestamp of its presumed last valid entry, which reads as 0.
      6. The segment's largest timestamp is therefore reported as 0, which can cause premature deletion of data by time-based retention.
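
      The sketch below illustrates the failure mode in steps 4-6. It is a hypothetical, simplified Scala model, not the actual kafka.log.TimeIndex implementation: the 12-byte entry layout (8-byte timestamp plus 4-byte relative offset) matches Kafka's time index format, but the object and method names are made up for illustration, and the "all slots are valid" assumption mirrors the loading behavior described in step 4.

      import java.io.RandomAccessFile

      // Hypothetical sketch, NOT kafka.log.TimeIndex. Models a loader that
      // assumes an existing index file is fully populated with valid entries.
      object TimeIndexSketch {
        val EntrySize = 12 // 8-byte timestamp + 4-byte relative offset

        // Timestamp of the presumed last entry. For a file that was resized
        // to its maximum size but never written (zero-filled), every slot
        // counts as an entry, so this returns 0.
        def lastTimestamp(file: java.io.File): Long = {
          val raf = new RandomAccessFile(file, "r")
          try {
            val entries = (raf.length() / EntrySize).toInt // assumes all slots valid
            if (entries == 0) -1L
            else {
              raf.seek((entries - 1).toLong * EntrySize)
              raf.readLong() // reads 0 from a zero-filled slot
            }
          } finally raf.close()
        }

        // Time-based retention check: with largestTimestamp == 0 the segment
        // looks roughly 50 years old, so any finite retention.ms deletes it.
        def shouldDelete(largestTimestamp: Long, retentionMs: Long, now: Long): Boolean =
          now - largestTimestamp > retentionMs

        def main(args: Array[String]): Unit = {
          val f = java.io.File.createTempFile("timeindex", ".sketch")
          f.deleteOnExit()
          val rw = new RandomAccessFile(f, "rw")
          rw.setLength(10L * EntrySize) // simulate step 2's resize; the extension is typically zero-filled
          rw.close()
          val ts = lastTimestamp(f) // 0, as in step 5
          val doomed = shouldDelete(ts, 7L * 24 * 60 * 60 * 1000, System.currentTimeMillis())
          println(s"largest timestamp = $ts, deleted under 7-day retention: $doomed")
        }
      }

      Running the sketch prints a largest timestamp of 0 and a retention check that evaluates to true, matching the premature-deletion outcome in step 6: the segment appears decades older than any realistic retention.ms.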


      People

      • Assignee: Raman Verma (ramanverma)
      • Reporter: Rohit Shekhar (rshekhar)
      • Reviewer: Jun Rao
      • Votes: 0
      • Watchers: 6
