Description
Today, whenever users specify a queryable store name for a KTable, we always add a physical state store to the translated processor topology.
In some scenarios we should consider not physically materializing the KTable and only "logically" materializing it. When a simple transformation, or even a join, generates a new KTable that needs to be materialized with a state store, we can reuse the changelog topic of the previous KTable and apply the transformation logic upon restoration instead of creating a new changelog topic. For example:
table1 = builder.table("topic1");
table2 = table1.filter(..).join(table3); // table2 needs to be materialized for joining
We can set the getter function of table2's materialized store, say state2, to read from topic1 and then apply the filter operator, instead of creating a new state2-changelog topic in this case.
We can come up with a general internal optimization that determines when to logically materialize, and decide whether we should allow DSL users to "hint" whether to materialize or not (which may then need a KIP).
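The getter idea above can be sketched in plain Java without any Kafka dependency. This is only an illustration under stated assumptions: `ValueGetter`, `physical`, and `filtered` are hypothetical names invented here, not Kafka Streams APIs, and a `HashMap` stands in for the store that is physically materialized from topic1. The logical view for table2 keeps no state of its own; it reads through to the parent and applies the filter at lookup time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiPredicate;

public class LogicalMaterializationSketch {

    /** Hypothetical read-only view of a keyed table (not a Kafka Streams interface). */
    public interface ValueGetter<K, V> {
        V get(K key);
    }

    /** Getter backed by a physical store; the map stands in for the store built from topic1. */
    public static <K, V> ValueGetter<K, V> physical(Map<K, V> store) {
        return store::get;
    }

    /**
     * Logical getter: no second store and no second changelog topic.
     * Each lookup reads the parent and applies the filter on the fly.
     */
    public static <K, V> ValueGetter<K, V> filtered(ValueGetter<K, V> parent,
                                                    BiPredicate<K, V> predicate) {
        return key -> {
            V value = parent.get(key);
            // A record that fails the filter is absent from the filtered table.
            return (value != null && predicate.test(key, value)) ? value : null;
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> state1 = new HashMap<>();
        state1.put("a", 5);
        state1.put("b", 50);

        ValueGetter<String, Integer> table1 = physical(state1);
        ValueGetter<String, Integer> table2 = filtered(table1, (k, v) -> v > 10);

        System.out.println(table2.get("a")); // prints "null": filtered out
        System.out.println(table2.get("b")); // prints "50": passes the filter
    }
}
```

The trade-off in this sketch is that the filter is evaluated on every read rather than once per update, in exchange for not storing or logging the filtered table a second time.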
Issue Links
- fixes
  - KAFKA-5581 Avoid creating changelog topics for state stores that are materialized from a source topic (Resolved)
- relates to
  - KAFKA-7577 Semantics of Table-Table Join with Null Message Are Incorrect (Resolved)
  - KAFKA-6761 Reduce Kafka Streams Footprint (Resolved)
- links to