Description
Case
Streams application with two input topics being used for a left join.
If the left-side topic is missing when the Streams application starts, the application hangs "in the middle" of the topology (at …00009, see below). Only some of the intermediate topics are created (up to …00009).
When the missing input topic is created, the streams application resumes processing.
Topology:
StreamsTask taskId: 2_0
    ProcessorTopology:
        KSTREAM-SOURCE-0000000011:
            topics: [mystreams_app-KTABLE-AGGREGATE-STATE-STORE-0000000009-repartition]
            children: [KTABLE-AGGREGATE-0000000012]
        KTABLE-AGGREGATE-0000000012:
            states: [KTABLE-AGGREGATE-STATE-STORE-0000000009]
            children: [KTABLE-TOSTREAM-0000000020]
        KTABLE-TOSTREAM-0000000020:
            children: [KSTREAM-SINK-0000000021]
        KSTREAM-SINK-0000000021:
            topic: data_udr_month_customer_aggregration
        KSTREAM-SOURCE-0000000017:
            topics: [mystreams_app-KSTREAM-MAP-0000000014-repartition]
            children: [KSTREAM-LEFTJOIN-0000000018]
        KSTREAM-LEFTJOIN-0000000018:
            states: [KTABLE-AGGREGATE-STATE-STORE-0000000009]
            children: [KSTREAM-SINK-0000000019]
        KSTREAM-SINK-0000000019:
            topic: data_UDR_joined
    Partitions [mystreams_app-KSTREAM-MAP-0000000014-repartition-0, mystreams_app-KTABLE-AGGREGATE-STATE-STORE-0000000009-repartition-0]
Why this matters
The application does quite a lot of preprocessing before joining with the missing input topic. Without that topic the preprocessing does not run, so a huge backlog of data builds up.
Fix
Issue a `warn` or `error` level message at startup to report the missing topic and its consequences.
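Until such a message exists in Kafka Streams itself, an application could run a similar check on its own before starting. The following is only a minimal sketch of that idea, not the fix that went into Kafka: the class `MissingTopicCheck`, its methods, and the wording of the message are hypothetical. It assumes the caller already has the set of topic names known to the cluster, which could for example be fetched with `AdminClient.listTopics().names().get()`.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper, not part of Kafka Streams.
final class MissingTopicCheck {

    // Returns the required topics that are absent from existingTopics,
    // preserving the order in which they were requested.
    static List<String> missingTopics(Collection<String> requiredTopics,
                                      Set<String> existingTopics) {
        List<String> missing = new ArrayList<>();
        for (String topic : requiredTopics) {
            if (!existingTopics.contains(topic)) {
                missing.add(topic);
            }
        }
        return missing;
    }

    // Builds the text to log at warn or error level; empty if nothing is missing.
    // The wording is an assumption, chosen to mention the consequence as well.
    static Optional<String> warningFor(List<String> missing) {
        if (missing.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of("Input topic(s) " + missing
                + " do not exist; processing will stall until they are created");
    }
}
```

An application would call `missingTopics` with its two input topic names before `KafkaStreams.start()` and pass any resulting message to its logger. Keeping the check free of Kafka types makes it easy to unit-test without a broker.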
Issue Links
- Blocked: KAFKA-6720 Inconsistent Kafka Streams behaviour when topic does not exist (Resolved)
- relates to: KAFKA-5037 Infinite loop if all input topics are unknown at startup (Resolved)