Details
-
Type: Improvement
-
Status: Open
-
Priority: Major
-
Resolution: Unresolved
-
2.8.1
-
None
-
None
Description
Add a sanity check at Kafka Streams startup that verifies repartition topics are not configured with a cleanup.policy of compact.
In enterprise environments, automated creation of Kafka Streams intermediate topics is not always possible because of policy restrictions. The topics are then created manually, which is prone to user misconfiguration.
In several cases we have found repartition topics that were set up incorrectly by copying the changelog topic setup, with compaction enabled. A compacted repartition topic loses data when more than one value is mapped to the new key. We noticed this where an aggregation follows a repartition topic and the aggregated value is incorrect.
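A minimal sketch of what such a check could look like, not a proposal for the actual implementation. It assumes the cleanup.policy of each topic has already been fetched (for example through the admin client's describeConfigs call) into a map of topic name to policy value. The class and method names are hypothetical; the only Kafka Streams convention relied on is that internal repartition topic names end in "-repartition".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Kafka Streams.
public class RepartitionTopicCheck {

    // Kafka Streams names its internal repartition topics with this suffix.
    static final String REPARTITION_SUFFIX = "-repartition";

    // Returns, in sorted order, the repartition topics whose cleanup.policy
    // includes "compact". The map goes from topic name to the raw
    // cleanup.policy value, e.g. "delete", "compact" or "compact,delete".
    static List<String> findCompactedRepartitionTopics(Map<String, String> cleanupPolicies) {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, String> e : new TreeMap<>(cleanupPolicies).entrySet()) {
            if (!e.getKey().endsWith(REPARTITION_SUFFIX)) {
                continue; // changelog and user topics are not checked here
            }
            boolean compacted = Arrays.stream(e.getValue().split(","))
                    .map(String::trim)
                    .anyMatch("compact"::equalsIgnoreCase);
            if (compacted) {
                offenders.add(e.getKey());
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        // Illustrative topic names and policies only.
        Map<String, String> policies = Map.of(
                "app-by-type-repartition", "compact",
                "app-counts-changelog", "compact",
                "app-other-repartition", "delete");
        List<String> offenders = findCompactedRepartitionTopics(policies);
        if (!offenders.isEmpty()) {
            System.out.println("Misconfigured repartition topics: " + offenders);
        }
    }
}
```

At startup the application could fail fast or log a warning when the returned list is not empty. Which of the two is appropriate is left open here.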
Example:
Original data: (coffee, drink), (tea, drink), (beer, drink)
Repartition by type i.e. drink:
Expected:
(drink, coffee), (drink, tea), (drink, beer)
With compaction the following is possible.
Actual:
(drink, beer)
coffee and tea are lost.
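The example above can be reproduced with a small simulation. This is a simplified model under stated assumptions: compaction is modelled as "keep only the latest value per key", whereas real log compaction runs asynchronously and does not touch the active segment, so the loss is possible rather than guaranteed. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; it does not use any Kafka API.
public class CompactionDemo {

    // Re-key each (name, type) record by its type, as a repartition step would.
    static List<Map.Entry<String, String>> repartitionByValue(List<Map.Entry<String, String>> records) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        for (Map.Entry<String, String> r : records) {
            out.add(Map.entry(r.getValue(), r.getKey()));
        }
        return out;
    }

    // Simplified compaction: only the most recent value for each key survives.
    static List<Map.Entry<String, String>> compact(List<Map.Entry<String, String>> records) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (Map.Entry<String, String> r : records) {
            latest.put(r.getKey(), r.getValue());
        }
        return new ArrayList<>(latest.entrySet());
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> original = List.of(
                Map.entry("coffee", "drink"),
                Map.entry("tea", "drink"),
                Map.entry("beer", "drink"));

        List<Map.Entry<String, String>> repartitioned = repartitionByValue(original);
        List<Map.Entry<String, String>> compacted = compact(repartitioned);

        // A downstream count per key sees three records without compaction
        // and a single record after compaction.
        System.out.println("Expected: " + repartitioned);
        System.out.println("Actual:   " + compacted);
    }
}
```

Any aggregation that reads the compacted topic sees only the surviving record, which matches the incorrect aggregate described in this ticket.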