Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
Samza 1.5 enabled the transactional state feature by default for all Samza jobs.
We uncovered a bug in reverting changelog state to the last checkpoint (trimming), which left containers stuck in the restore phase indefinitely. The bug occurs in the trimming phase of state restore, in which uncheckpointed messages in the changelog have their values reverted to match the job's last checkpoint. If a job needed to trim a non-zero number of messages, the restore process would read and re-write those trimmed messages repeatedly without ever terminating, which prevented the job from completing startup.
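To illustrate the failure mode, here is a minimal toy model in Java. It is not Samza code: the class `TrimSketch`, the method `trim`, the `snapshotEnd` flag, and the `maxSteps` safety cap are all hypothetical names introduced for this sketch. It assumes that trimming an uncheckpointed message appends a reverting write to the same changelog, and that the buggy loop treats the live end of the changelog as its stopping point. Under those assumptions, every reverting write becomes a new message to trim, so the loop never catches up. The actual cause and fix in Samza may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class TrimSketch {
    /**
     * Toy model of the trimming phase (hypothetical, not Samza's implementation).
     * Entries after checkpointOffset are "uncheckpointed"; trimming one appends
     * a reverting write to the same log.
     *
     * snapshotEnd = true  : stop at the log end observed before trimming began.
     * snapshotEnd = false : stop at the live log end, which keeps moving.
     * maxSteps caps the number of writes so the sketch always terminates.
     *
     * Returns the number of reverting writes appended.
     */
    public static int trim(List<String> log, int checkpointOffset,
                           boolean snapshotEnd, int maxSteps) {
        int fixedEnd = log.size(); // end of the log before any trimming
        int writes = 0;
        for (int i = checkpointOffset + 1;
             i < (snapshotEnd ? fixedEnd : log.size());
             i++) {
            if (writes >= maxSteps) {
                break; // safety cap; without it the live-end loop never exits
            }
            // Revert the key at offset i by appending a new changelog entry.
            log.add(log.get(i) + ":reverted");
            writes++;
        }
        return writes;
    }

    public static void main(String[] args) {
        // Five entries, checkpoint at offset 2, so offsets 3 and 4 need trimming.
        List<String> bounded = new ArrayList<>(List.of("k1", "k2", "k3", "k4", "k5"));
        System.out.println(trim(bounded, 2, true, 1000));   // prints 2

        List<String> live = new ArrayList<>(List.of("k1", "k2", "k3", "k4", "k5"));
        System.out.println(trim(live, 2, false, 1000));     // prints 1000 (hit the cap)

        // Nothing to trim: both variants finish immediately.
        List<String> clean = new ArrayList<>(List.of("k1", "k2", "k3", "k4", "k5"));
        System.out.println(trim(clean, 4, false, 1000));    // prints 0
    }
}
```

In this model the loop only runs away when at least one message needs trimming, which matches the condition in the description: a job with nothing to trim starts normally, while a job with a non-zero number of trimmed messages never leaves restore.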