Details
-
New Feature
-
Status: Resolved
-
P2
-
Resolution: Fixed
-
None
Description
Today, we have insufficient control over the event time timestamp of elements output from a timer callback.
1. For an event time timer, it is the timestamp of the timer itself.
2. For a processing time timer, it is the current input watermark at the time of processing.
But for both of these, we may want to reserve the right to output a particular time, aka set a "watermark hold".
A naive implementation of a TimerWithWatermarkHold would work for making sure output is not droppable, but does not fully explain window expiration and late data/timer dropping.
In the natural interpretation of a timer as a feedback loop on a transform, timers should be viewed as another channel of input, with a watermark, and items on that channel all need event time timestamps even if they are delivered according to a different time domain.
I propose that the specification for when a timer should fire should be separated (with nice defaults) from the specification of the event time of resulting outputs. These timestamps will determine a side channel with a new "timer watermark" that constrains the output watermark.
- We still need to fire event time timers according to the input watermark, so that event time timers fire.
- Late data dropping and window expiration will be in terms of the minimum of the input watermark and the timer watermark. In this way, whenever a timer is set, the window is not going to be garbage collected.
- We will need to make sure we have a way to "wake up" a window once it is expired; this may be as simple as exhausting the timer channel as soon as the input watermark indicates expiration of a window
This is mostly aimed at end-user timers in a stateful+timely DoFn. It seems reasonable to use timers as an implementation detail (e.g. in runners-core utilities) without wanting any of this additional machinery. For example, if there is no possibility of output from the timer callback.
Attachments
Issue Links
- Blocked
-
BEAM-1589 Add OnWindowExpiration method to Stateful DoFn
- Triage Needed
- is depended upon by
-
BEAM-2925 Fn API user timer support
- Resolved
- relates to
-
BEAM-9566 Performance regression of FlinkRunner stream mode due to watermark holds update
- Resolved
- supercedes
-
BEAM-1696 Allow explicit output time for processing time timers
- Resolved
-
BEAM-1697 Allow output time that is some offset from target time of event time timer
- Resolved
- links to