Description
This is a bug discovered with the new EOS protocol (KIP-447), here's the context:
In Streams when we are assigned with the new active tasks, we would first try to restore the state from the changelog topic all the way to the log end offset, and then we can transit from the `restoring` to the `running` state to start processing the task.
Before KIP-447, the end-offset call is only triggered after we've passed the synchronization barrier at the txn-coordinator which would guarantee that the txn-marker has been sent and received (otherwise we would error with CONCURRENT_TRANSACTIONS and let the producer retry), and when the txn-marker is received, it also means that the marker has been fully replicated, which in turn guarantees that the data written before that marker has been fully replicated. As a result, when we send the list-offset with `read-committed` flag we are guaranteed that the returned offset == LSO == high-watermark.
After KIP-447 however, we do not fence on the txn-coordinator but on group-coordinator upon offset-fetch, and the group-coordinator would return the fetching offset right after it has received the replicated the txn-marker sent to it. However, since the txn-marker are sent to different brokers in parallel, and even within the same broker markers of different partitions are appended / replicated independently as well, so when the fetch-offset request returns it is NOT guaranteed that the LSO on other data partitions would have been advanced as well. And hence in that case the `endOffset` call may returned a smaller offset, causing data loss.
Attachments
Issue Links
- causes
-
KAFKA-10148 Kafka Streams Restores too few Records with eos-beta Enabled
- Closed
- links to