Kafka / KAFKA-7294

Optimize repartitioning for merge()

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: streams

    Description

      For a merge() operator, we check at compile time whether any of the input KStreams requires repartitioning and, if so, set the "requiresRepartitioning" flag on the output KStream. As a result, data from all input KStreams is piped through the repartition topic after the merge(), including data from inputs that are already correctly partitioned.

      Using our optimizer, we could push the repartition operation down to before the merge(), so that only the KStream(s) that actually require repartitioning are repartitioned. This would save network IO for all KStreams that don't require repartitioning.

      Note that this optimization is only correct if all input streams are co-partitioned (cf. KAFKA-7293).
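
      To make the potential saving concrete, here is a minimal sketch that compares how many records would pass through a repartition topic under the two plans. It is a plain-Java model, not Kafka Streams code: the class MergeRepartitionSketch, the Input type, and both methods are hypothetical names invented for illustration, and the record counts are made up. It assumes a key-based operation follows the merge(), so that a set flag really triggers a repartition.

```java
import java.util.List;

public class MergeRepartitionSketch {

    /** Hypothetical stand-in for one input KStream of merge(); not a Kafka Streams class. */
    static final class Input {
        final String name;
        final long records;
        final boolean requiresRepartitioning;

        Input(String name, long records, boolean requiresRepartitioning) {
            this.name = name;
            this.records = records;
            this.requiresRepartitioning = requiresRepartitioning;
        }
    }

    /**
     * Behavior described in the ticket: if any input is flagged, the merged
     * stream is flagged, so every record of every input is repartitioned.
     */
    static long repartitionedAfterMerge(List<Input> inputs) {
        boolean anyFlagged = inputs.stream().anyMatch(i -> i.requiresRepartitioning);
        return anyFlagged ? inputs.stream().mapToLong(i -> i.records).sum() : 0L;
    }

    /**
     * Proposed plan: repartition before the merge, so only the records of
     * flagged inputs are repartitioned.
     */
    static long repartitionedWithPushDown(List<Input> inputs) {
        return inputs.stream()
                .filter(i -> i.requiresRepartitioning)
                .mapToLong(i -> i.records)
                .sum();
    }

    public static void main(String[] args) {
        // Illustrative numbers only: one small re-keyed stream, one large untouched stream.
        List<Input> inputs = List.of(
                new Input("re-keyed", 1_000L, true),
                new Input("already-partitioned", 9_000L, false));

        System.out.println("after merge: " + repartitionedAfterMerge(inputs));
        System.out.println("push-down:   " + repartitionedWithPushDown(inputs));
    }
}
```

      In this model the push-down plan never repartitions more records than the current plan, and the gap grows with the share of data coming from inputs that do not need repartitioning. The model does not capture the co-partitioning requirement noted above.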

            People

              Assignee: Unassigned
              Reporter: Matthias J. Sax (mjsax)
              Votes: 0
              Watchers: 5
