Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-16817

FK Join Response topic can be lower volume

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • streams

    Description

      IIRC, the current state is that when right-hand-side records get updated, they scan the subscription registry and send a separate "response" message for each left-hand-side record that needs to be updated.

      There are two problems with this approach:

      1. the partitioning strategy of the left-hand topic is not known in Streams, so there's an opportunity for a bug to route the responses incorrectly. This was previously a design bug, and we added a partitioner argument to the FKJoin to at least allow users to avoid it.
      2. there may be an arbitrary number of left-hand-side records (tens, hundreds, or thousands or more), so the response topic itself may become a bottleneck. However, we really only do this for routing , and there's a fix number of routing destinations, the number of partitions on the left-hand side.

      We can fix both of these problems by storing the partition of the subscription message, then grouping triggered lhs keys by partition and sending fewer, larger, "response" messages back to the lhs when the rhs records are updated. We can send these messages explicitly to the partition that the subscriptions originally came from, as opposed to running the partitioner on the lhs key.

      Attachments

        Activity

          People

            Unassigned Unassigned
            vvcephei John Roesler
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated: