Details
- New Feature
- Status: Resolved
- Major
- Resolution: Fixed
- 0.14.0
Description
In SAMZA-974, we built a mechanism to support batch jobs with bounded data sources. The feature allows Samza jobs to shut down once all input topic partitions reach the end of stream.
This works for applications that do not produce to and consume from the same streams. With the introduction of partitionBy operators, an application can write to an intermediate stream and then consume that same stream for further processing. Because end-of-stream tokens are not carried over from the original input streams to the intermediate streams, the job cannot shut down even after all input streams reach the end. To address this problem, we need to extend the existing end-of-stream feature to support applications with intermediate streams.
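To make the problem concrete, here is a minimal sketch of one way end-of-stream could be tracked across both input and intermediate streams. It is an illustration under stated assumptions, not Samza's implementation: the class `EndOfStreamTracker` and its methods are hypothetical, and it assumes each upstream task sends one end-of-stream control message to every intermediate partition it writes to once its own inputs are exhausted.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is not Samza's EndOfStreamManager.
class EndOfStreamTracker {
    // For each consumed partition, how many producers must signal end-of-stream.
    // An original input partition has one producer (the source itself); an
    // intermediate partition has one producer per upstream task writing to it.
    private final Map<String, Integer> expectedProducers = new HashMap<>();
    // For each consumed partition, the producers that have signalled so far.
    private final Map<String, Set<String>> finishedProducers = new HashMap<>();

    void register(String partition, int producerCount) {
        if (producerCount < 1) {
            throw new IllegalArgumentException("producerCount must be >= 1");
        }
        expectedProducers.put(partition, producerCount);
        finishedProducers.put(partition, new HashSet<>());
    }

    // Record an end-of-stream control message from one producer.
    // Duplicate messages from the same producer are counted once.
    void onEndOfStream(String partition, String producer) {
        Set<String> finished = finishedProducers.get(partition);
        if (finished == null) {
            throw new IllegalArgumentException("unregistered partition: " + partition);
        }
        finished.add(producer);
    }

    // A partition has ended only when every expected producer has signalled.
    boolean isEnded(String partition) {
        Integer expected = expectedProducers.get(partition);
        return expected != null && finishedProducers.get(partition).size() >= expected;
    }

    // The job may shut down once every registered partition has ended.
    boolean canShutDown() {
        if (expectedProducers.isEmpty()) {
            return false;
        }
        for (String partition : expectedProducers.keySet()) {
            if (!isEnded(partition)) {
                return false;
            }
        }
        return true;
    }
}
```

In this sketch, a job that consumes `input-0` and an intermediate partition fed by two upstream tasks would not report `canShutDown()` until the input has ended and both tasks have signalled end-of-stream on the intermediate partition. Without such propagation, the intermediate partition never ends, which is the failure described above.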
Issue Links
- depends upon
  - SAMZA-1312 Add Control Messages and Intermediate Stream Serde (Resolved)
  - SAMZA-1321 Propagate end-of-stream messages (Resolved)
  - SAMZA-1322 Propagate watermark messages (Resolved)
  - SAMZA-1351 Separate Watermark Implementation from Propagation (Resolved)
  - SAMZA-1352 Detect the upstream tasks when we build the graph (Resolved)
  - SAMZA-1354 Pass in typed control message into EndOfStreamManager.update() and WatermarkManager.update() (Resolved)