Description
We'd like to add a new API to the KStream object of the Streams DSL:
KTable KStream.toTable() KTable KStream.toTable(Materialized)
The function re-interpret the event stream KStream as a changelog stream KTable. Note that this should NOT be treated as a syntax-sugar as a dummy KStream.reduce() function which always take the new value, as it has the following difference:
1) an aggregation operator of KStream is for aggregating a event stream into an evolving table, which will drop null-values from the input event stream; whereas a toTable function will completely change the semantics of the input stream from event stream to changelog stream, and null-values will still be serialized, and if the resulted bytes are also null they will be interpreted as "deletes" to the materialized KTable (i.e. tombstones in the changelog stream).
2) the aggregation result KTable will always be materialized, whereas toTable resulted KTable may only be materialized if the overloaded function with Materialized is used (and if optimization is turned on it may still be only logically materialized if the queryable name is not set).
Therefore, for users who want to take a event stream into a changelog stream (no matter why they cannot read from the source topic as a changelog stream KTable at the beginning), they should be using this new API instead of the dummy reduction function.
Attachments
Issue Links
- Is contained by
-
KAFKA-9483 Add Scala KStream#toTable to the Streams DSL
- Resolved
- relates to
-
KAFKA-7577 Semantics of Table-Table Join with Null Message Are Incorrect
- Resolved
- links to