[KAFKA-6037] Make Sub-topology Parallelism Tunable - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.6.0
Component/s: streams
Labels:
- kip

Description

Today, a downstream sub-topology's parallelism (that is, its number of tasks) depends purely on the upstream sub-topology's parallelism, which ultimately depends on the source topic's number of partitions. However, this does not work well in dynamic scaling scenarios.

Imagine you have a simple aggregation application. It would have two sub-topologies, split by the repartition topic. The first sub-topology is very light: it reads from the input topics and writes to the repartition topic based on the aggregation key. The second sub-topology does the actual work with the aggregation state store and so on, and is therefore computationally heavy. Right now the first and second sub-topologies always have the same number of tasks, because the repartition topic's number of partitions is defined to be the same as the source topic's number of partitions. So to scale up, we have to increase the number of input topic partitions.

One way to improve on that is to use a large default number of partitions for repartition topics and also allow users to override it (either through DSL code or through config). With this, different sub-topologies would have different numbers of tasks, i.e. parallelism units. In addition, users may also use this config to "hint" the DSL translator NOT to create the repartition topics (i.e. not to break up the sub-topologies) when they have knowledge of the data format.
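To make the coupling concrete, here is a minimal sketch of how the task counts of the two sub-topologies relate, with and without an override. It is a toy model, not Kafka Streams code: the class `SubTopologyParallelism`, the helpers `repartitionPartitions` and `taskCounts`, and the sub-topology labels are all hypothetical names chosen for illustration, and it assumes one task per input partition of each sub-topology.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalInt;

// Toy model of the behaviour described in this issue (not the Kafka Streams API).
class SubTopologyParallelism {

    // Hypothetical helper: partition count of the repartition topic.
    // Without an override it inherits the source topic's partition count,
    // which is the current behaviour this issue describes.
    static int repartitionPartitions(int sourcePartitions, OptionalInt override) {
        return override.orElse(sourcePartitions);
    }

    // Assumption: each sub-topology gets one task per partition of its input topic.
    static Map<String, Integer> taskCounts(int sourcePartitions, OptionalInt override) {
        Map<String, Integer> tasks = new LinkedHashMap<>();
        // Light sub-topology: reads the input topic, writes to the repartition topic.
        tasks.put("sub-topology-0", sourcePartitions);
        // Heavy sub-topology: reads the repartition topic, maintains the aggregation state.
        tasks.put("sub-topology-1", repartitionPartitions(sourcePartitions, override));
        return tasks;
    }

    public static void main(String[] args) {
        // Today: both sub-topologies are tied to the 4 source partitions.
        System.out.println(taskCounts(4, OptionalInt.empty()));
        // Proposed: the heavy sub-topology scales independently via an override.
        System.out.println(taskCounts(4, OptionalInt.of(16)));
    }
}
```

With the override, only the heavy aggregation sub-topology gains tasks, and the input topic keeps its 4 partitions. Per the issue links on this page, the DSL-level override was delivered through the `KStream#repartition` operation (KAFKA-8611).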

Issue Links

is fixed by

KAFKA-8611 Add KStream#repartition operation

Resolved

Sub-Tasks

Avoid repartitioning when key change doesn't change partitions

Open

Unassigned

People

Assignee: Levani Kokhreidze

Reporter: Guozhang Wang

Votes: 0

Watchers: 4

Dates

Created: 10/Oct/17 00:43

Updated: 30/Apr/20 16:57

Resolved: 09/Apr/20 23:55