Kafka / KAFKA-6903

Improve KTable's sending old value behavior

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: streams
    • Labels: None

    Description

      Today, in KTable's internal implementation, if old values are needed downstream (e.g. if there is a downstream aggregation, so that old values need to be re-sent to "subtract" their effects in addition to incorporating the effects of the new values), we re-compute the old values from the old values passed in by the parent. This behavior has two issues:

      1) Re-computing the values means extra cost: each updated value is computed twice, once as the new value and once as the old value. Ideally this additional cost can be avoided.

      2) If the computational logic depends on some state that can be updated over time, then calling the same function again may produce a different value, because it sees a different snapshot of that state.
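
The two issues above can be illustrated with a minimal, self-contained sketch. It does not use the Kafka Streams API: `Change`, `mapper`, `multiplier`, and `recompute` are hypothetical stand-ins, assuming only that an update carries a new and an old value and that both are passed through the same mapping function.

```java
public class OldValueRecompute {
    /** Hypothetical stand-in for an update: the new value and the value it replaces. */
    static final class Change<V> {
        final V newValue;
        final V oldValue;

        Change(V newValue, V oldValue) {
            this.newValue = newValue;
            this.oldValue = oldValue;
        }
    }

    // Mutable state the mapping function depends on; it may change between updates.
    static int multiplier = 2;
    // Counts how often the mapping function does real work.
    static int mapperCalls = 0;

    static Integer mapper(Integer value) {
        if (value == null) {
            return null; // nothing to map for a missing old value
        }
        mapperCalls++;
        return value * multiplier;
    }

    // Mimics the behavior described in this ticket: the old value is not
    // remembered but re-computed by applying the mapper to the parent's old value.
    static Change<Integer> recompute(Change<Integer> parent) {
        return new Change<>(mapper(parent.newValue), mapper(parent.oldValue));
    }

    public static void main(String[] args) {
        // First update: key goes from absent to 10, downstream sees new = 20.
        Change<Integer> first = recompute(new Change<>(10, null));

        // The state changes before the next update arrives.
        multiplier = 3;

        // Second update: 10 -> 11. The old value is re-computed as 10 * 3 = 30,
        // although 20 is what was actually sent downstream earlier (issue 2).
        Change<Integer> second = recompute(new Change<>(11, 10));

        System.out.println("emitted earlier: " + first.newValue);    // 20
        System.out.println("old value sent now: " + second.oldValue); // 30
        // The value 10 was mapped twice, once as new and once as old (issue 1).
        System.out.println("mapper calls: " + mapperCalls);           // 3
    }
}
```

A downstream aggregation that added 20 and later subtracts 30 ends up with an incorrect total, which is the inconsistency issue 2 describes.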

      We should consider how to improve this behavior to avoid the above issues. More specifically: if the KTable is materialized, we can consider reading the old value directly from the materialized store. Note that this may not always be the best choice: for a simple filter, for example, re-computing may be cheaper than reading from a persistent store, which takes more time. How about applying a black-box transformer? How about doing a join?
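
As a rough sketch of the store-based idea, and not the actual Kafka Streams implementation, the following uses a plain `HashMap` as a hypothetical stand-in for the materialized store. `StoreBackedOldValue`, `process`, and `Change` are illustrative names that do not appear in Kafka Streams; the sketch assumes the store holds the last result emitted per key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class StoreBackedOldValue {
    /** Hypothetical stand-in for an update: the new value and the value it replaces. */
    static final class Change<V> {
        final V newValue;
        final V oldValue;

        Change(V newValue, V oldValue) {
            this.newValue = newValue;
            this.oldValue = oldValue;
        }
    }

    // In-memory stand-in for the materialized store: last emitted result per key.
    private final Map<String, Integer> store = new HashMap<>();
    private final Function<Integer, Integer> mapper;
    int mapperCalls = 0;

    StoreBackedOldValue(Function<Integer, Integer> mapper) {
        this.mapper = mapper;
    }

    // Computes only the new value; the old value is read back from the store,
    // so it is exactly what was emitted for this key before.
    Change<Integer> process(String key, Integer parentNewValue) {
        Integer oldValue = store.get(key);
        Integer newValue = null;
        if (parentNewValue != null) {
            mapperCalls++;
            newValue = mapper.apply(parentNewValue);
        }
        if (newValue == null) {
            store.remove(key); // a null result acts as a delete
        } else {
            store.put(key, newValue);
        }
        return new Change<>(newValue, oldValue);
    }

    public static void main(String[] args) {
        int[] multiplier = {2}; // mutable state the mapper depends on
        StoreBackedOldValue table = new StoreBackedOldValue(v -> v * multiplier[0]);

        Change<Integer> first = table.process("k", 10);  // new = 20, old = null
        multiplier[0] = 3;                               // state changes
        Change<Integer> second = table.process("k", 11); // new = 33, old = 20

        System.out.println("old value sent now: " + second.oldValue); // 20
        System.out.println("mapper calls: " + table.mapperCalls);     // 2
    }
}
```

The trade-off raised in the description still applies: here the store read is a cheap map lookup, whereas a lookup in a persistent store may cost more than re-applying a trivial function such as a filter.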

          People

            Assignee: Unassigned
            Reporter: Guozhang Wang (guozhang)