Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
None
-
None
-
None
Description
For a merge() operator we check at compile time, if one of the input KStreams requires repartitioning and set the "requiresRepartitioning" flag for the output KStream for this case. This implies, that we pipe data from all input KStreams through the repartition topic after the merge().
Using our optimizer, we could push down the repartition operation before the merge() to only repartition the KStream(s) that required repartition and thus save network IO for all KStreams that don't require repartitioning.
Note, that this operation is only correct, if all input streams are co-partitioned (cf. KAFKA-7293).
Attachments
Issue Links
- is blocked by
-
KAFKA-7293 Merge followed by groupByKey/join might violate co-partioning
- Open