Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
Input topics subscribed to via pattern subscription are currently all grouped into the same node group, so the number of tasks is determined by the maximum partition count of any matching topic. This reduces per-partition overhead and suits some scenarios, but it limits the ability to scale out, because partitions that could be processed independently are not parallelized any further. We should consider making it possible for pattern subscription to create one task per partition, summed across all matching topics.
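To make the difference concrete, here is a minimal sketch in plain Java (no Kafka dependency) that compares the two task counts. The class name, method names, topic names, and partition counts are all made up for illustration; this is only the arithmetic described above, not how Streams computes tasks internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TaskCountSketch {
    // Current behavior as described above: all matching topics share one
    // node group, so the task count is the largest partition count.
    public static int tasksWhenGrouped(Map<String, Integer> partitionsPerTopic) {
        return partitionsPerTopic.values().stream()
                .mapToInt(Integer::intValue)
                .max()
                .orElse(0);
    }

    // Alternative under discussion: one task per partition, summed across
    // all topics that match the pattern.
    public static int tasksWhenIndependent(Map<String, Integer> partitionsPerTopic) {
        return partitionsPerTopic.values().stream()
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        // Hypothetical topics matching a pattern such as "events-.*"
        Map<String, Integer> partitionsPerTopic = new LinkedHashMap<>();
        partitionsPerTopic.put("events-a", 4);
        partitionsPerTopic.put("events-b", 8);
        partitionsPerTopic.put("events-c", 2);

        System.out.println("grouped: " + tasksWhenGrouped(partitionsPerTopic));         // 8
        System.out.println("independent: " + tasksWhenIndependent(partitionsPerTopic)); // 14
    }
}
```

With these example numbers, the grouped layout caps parallelism at 8 tasks, while one task per partition would allow 14.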
We don't necessarily want to change the default (current) behavior, but we could make this more flexible, either by autoscaling based on some heuristic or by making it customizable by the user. One possibility would be to augment the Pattern-based source KStream method with an optional parameter that tells Streams how to generate tasks for that pattern, for example:
public synchronized KStream stream(pattern, numTasks);
Issue Links
- duplicates KAFKA-9282 Consider more flexible node grouping for Pattern subscription (Open)