Description
N.B. This bug has been verified with exactly-once processing enabled.
Consider the following scenario:
- A record, N, is read into a Kafka topology
- The state store is updated
- The topology crashes
Expected behaviour:
- Node is restarted
- The offset was never committed, so record N is due to be read again
- The state store is reset to its contents as of record N-1
- Record N is reprocessed against that state
Actual behaviour:
- Node is restarted
- Record N is reprocessed (good)
- The state store still holds the update written by the earlier, crashed processing of record N
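
To make the difference between the two outcomes concrete, here is a minimal toy model in Python. It is not Kafka Streams code and is not the proof-of-concept mentioned in this report: `ToyTask`, `CrashError`, the `rollback_on_restart` flag and the running-sum "state store" are all hypothetical names invented for illustration. It only assumes that the store is updated before the offset is committed, as in the scenario above.

```python
# Toy model only: no Kafka involved. All names here are hypothetical.

class CrashError(Exception):
    """Stands in for the topology crashing mid-processing."""


class ToyTask:
    def __init__(self, records, rollback_on_restart):
        self.records = records                    # input partition, indexed by offset
        self.committed_offset = 0                 # next offset to read after a restart
        self.store = {"sum": 0}                   # current "state store" contents
        self.committed_store = dict(self.store)   # store contents as of the last commit
        self.rollback_on_restart = rollback_on_restart

    def process_next(self, crash_before_commit=False):
        value = self.records[self.committed_offset]
        self.store["sum"] += value                # the state store is updated
        if crash_before_commit:
            raise CrashError("crashed before the offset was committed")
        # Commit: the offset and the store snapshot advance together.
        self.committed_offset += 1
        self.committed_store = dict(self.store)

    def restart(self):
        if self.rollback_on_restart:
            # Expected behaviour: uncommitted store changes are discarded.
            self.store = dict(self.committed_store)
        # Otherwise (actual behaviour): the store is left as the crashed run wrote it.


def run(rollback_on_restart):
    task = ToyTask(records=[10, 20, 30], rollback_on_restart=rollback_on_restart)
    task.process_next()                           # offset 0
    task.process_next()                           # offset 1
    try:
        task.process_next(crash_before_commit=True)   # record N (offset 2) crashes
    except CrashError:
        task.restart()
    task.process_next()                           # record N is reprocessed
    return task.store["sum"]


expected = run(rollback_on_restart=True)
actual = run(rollback_on_restart=False)
print("expected:", expected)   # 60: record N counted once
print("actual:", actual)       # 90: record N counted twice
```

In this sketch the only difference between the two runs is whether the store is rolled back to its last committed contents on restart; without the rollback, reprocessing record N applies its update on top of the update left behind by the crashed attempt.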
I'd consider this a corruption of the state store, hence the Critical priority, although High may be more appropriate.
I wrote a proof-of-concept here, which demonstrates the problem on Linux: