Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- 1.0.2
- None
Description
The Kafka spout currently emits duplicate tuples whenever a Kafka topic's leader changes.
When a leader change causes an exception, the PartitionManagers are simply recreated. This loses the state about which tuples were already emitted, so the new PartitionManager emits them again.
This is acceptable in that the at-least-once guarantee is still fulfilled, but it would be better not to emit duplicate data where possible.
Moreover, this could easily be avoided by moving the state related to emitted tuples from the old PartitionManager to the new one.
Pull requests implementing this:
1.0.x-branch - https://github.com/apache/storm/pull/1873
1.x-branch - https://github.com/apache/storm/pull/1888
Pull request for related bugfix: https://github.com/apache/storm/pull/1940
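
The state transfer proposed in the description can be illustrated with a minimal sketch. This is not the code from the pull requests above: PartitionState and PartitionRegistry are hypothetical stand-ins, not Storm classes, and the sketch assumes that the only state worth carrying over is the next offset to emit plus the set of emitted-but-unacked offsets.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical stand-in for the per-partition state a spout keeps.
// It does not mirror Storm's actual PartitionManager.
class PartitionState {
    final int partition;
    long nextOffsetToEmit;                            // next offset to emit
    final SortedSet<Long> pending = new TreeSet<>();  // emitted, not yet acked

    // Fresh state: start again from the last committed offset.
    PartitionState(int partition, long committedOffset) {
        this.partition = partition;
        this.nextOffsetToEmit = committedOffset;
    }

    // Copy constructor: take over the emitted-tuple state of the old instance.
    PartitionState(PartitionState old) {
        this.partition = old.partition;
        this.nextOffsetToEmit = old.nextOffsetToEmit;
        this.pending.addAll(old.pending);
    }

    // Emits the next offset and remembers it as pending.
    long emitNext() {
        long offset = nextOffsetToEmit++;
        pending.add(offset);
        return offset;
    }

    void ack(long offset) {
        pending.remove(offset);
    }
}

// Hypothetical registry that recreates per-partition state after a leader change.
class PartitionRegistry {
    private final Map<Integer, PartitionState> states = new HashMap<>();

    // carryState = false models the old behavior (state is dropped);
    // carryState = true models the proposed behavior (state is moved over).
    PartitionState refresh(int partition, long committedOffset, boolean carryState) {
        PartitionState old = states.get(partition);
        PartitionState fresh = (carryState && old != null)
                ? new PartitionState(old)
                : new PartitionState(partition, committedOffset);
        states.put(partition, fresh);
        return fresh;
    }
}
```

In this sketch, if a partition has emitted offsets 100 to 102 and is then refreshed with carryState set to true, the new instance continues at offset 103 and still knows which offsets are pending. With carryState set to false it restarts from the committed offset and emits 100 again, which is the duplication described above.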
Issue Links
- is related to
  STORM-2361 Kafka spout - after topic leader change, it stops committing offsets to ZK (Resolved)
- relates to
  STORM-3090 The same offset value is used by the same partition number of different topics. (Resolved)
- links to