[FLINK-18647] How to handle processing time timers with bounded input - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Not a Priority
Resolution: Unresolved
Affects Version/s: 1.11.0
Fix Version/s: None
Component/s: API / DataStream
Labels:

Description

(most of this description comes from an offline discussion between me, AHeise, roman_khachatryan, aljoscha and sunhaibotb)

When the input ends (for example, with bounded sources), all pending (untriggered) processing-time timers are dropped. In some cases this is desirable, but for WindowOperator, for example, it means that the last trailing window is never triggered, causing apparent data loss.

There are a couple of ideas to consider.

1. Provide a way for users to decide what to do with such timers: cancel, wait, or trigger immediately, for example by overloading the existing methods ProcessingTimeService#registerTimer and ProcessingTimeService#scheduleAtFixedRate in the following way:

ScheduledFuture<?> registerTimer(long timestamp, ProcessingTimeCallback target, TimerAction timerAction);

enum TimerAction {
    CANCEL_ON_END_OF_INPUT,
    TRIGGER_ON_END_OF_INPUT,
    WAIT_ON_END_OF_INPUT
}

or maybe:

public interface TimerAction {
    void onEndOfInput(ScheduledFuture<?> timer);
}

But this would also mean that we store additional state with each timer, that we need to modify the serialization format (providing some kind of state migration path), and that we potentially increase the size footprint of the timers.
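As a rough illustration of the enum variant of idea 1, here is a minimal, self-contained sketch in which each pending timer carries its TimerAction and the actions are applied once the input ends. Everything except the TimerAction constants is hypothetical (TimerActionSketch, Callback, PendingTimer, TimerService, endOfInput); it does not use or mirror Flink's actual ProcessingTimeService.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are Flink's actual API.
final class TimerActionSketch {

    enum TimerAction {
        CANCEL_ON_END_OF_INPUT,
        TRIGGER_ON_END_OF_INPUT,
        WAIT_ON_END_OF_INPUT
    }

    /** Stand-in for a processing-time callback. */
    interface Callback {
        void onProcessingTime(long timestamp);
    }

    /** A pending timer; the action is the extra per-timer state. */
    private static final class PendingTimer {
        final long timestamp;
        final Callback target;
        final TimerAction action;

        PendingTimer(long timestamp, Callback target, TimerAction action) {
            this.timestamp = timestamp;
            this.target = target;
            this.action = action;
        }
    }

    static final class TimerService {
        private final List<PendingTimer> pending = new ArrayList<>();

        void registerTimer(long timestamp, Callback target, TimerAction action) {
            pending.add(new PendingTimer(timestamp, target, action));
        }

        /** Applies each timer's action and returns how many timers remain pending. */
        int endOfInput() {
            // Work on a snapshot so callbacks may register new timers safely.
            List<PendingTimer> snapshot = new ArrayList<>(pending);
            pending.clear();
            for (PendingTimer timer : snapshot) {
                switch (timer.action) {
                    case TRIGGER_ON_END_OF_INPUT:
                        timer.target.onProcessingTime(timer.timestamp);
                        break;
                    case WAIT_ON_END_OF_INPUT:
                        pending.add(timer); // keep it until its wall-clock time
                        break;
                    case CANCEL_ON_END_OF_INPUT:
                        break; // dropped
                }
            }
            return pending.size();
        }
    }
}
```

The action field in PendingTimer is the per-timer state that would have to be serialized along with the timer.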

The extra overhead could be avoided with some kind of Map<Timer, TimerAction>, where a missing entry means some default value.
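A minimal sketch of such a side table follows, assuming (this is not stated in the issue) that the default is CANCEL_ON_END_OF_INPUT, which matches the dropping behaviour described at the top. The names TimerActionSideTableSketch, ActionSideTable, setAction, actionFor, forget and overrideCount are hypothetical, and the type parameter T stands in for whatever timer type would be used as the key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not part of Flink.
final class TimerActionSideTableSketch {

    enum TimerAction {
        CANCEL_ON_END_OF_INPUT,
        TRIGGER_ON_END_OF_INPUT,
        WAIT_ON_END_OF_INPUT
    }

    /** Stores an action only for timers that deviate from the default. */
    static final class ActionSideTable<T> {
        // Assumption: dropping timers is the default, as it is today.
        static final TimerAction DEFAULT_ACTION = TimerAction.CANCEL_ON_END_OF_INPUT;

        private final Map<T, TimerAction> overrides = new HashMap<>();

        void setAction(T timer, TimerAction action) {
            if (action == DEFAULT_ACTION) {
                overrides.remove(timer); // default needs no entry
            } else {
                overrides.put(timer, action);
            }
        }

        /** A missing entry means the default action. */
        TimerAction actionFor(T timer) {
            return overrides.getOrDefault(timer, DEFAULT_ACTION);
        }

        /** Call when the timer fires or is deleted, so the table does not grow. */
        void forget(T timer) {
            overrides.remove(timer);
        }

        int overrideCount() {
            return overrides.size();
        }
    }
}
```

With this layout, timers that use the default cost nothing extra and the timers themselves are unchanged. Such a table would presumably have to be kept consistent with the timers on restore, which this sketch does not address.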

2. Another way to solve this problem might be to let the operator code decide what to do with a given timer:

a. Either ask an operator what should happen with a given timer,
b. or let the operator iterate and cancel the timers on endOfInput(),
c. or just fire the timer with some endOfInput flag.

I think none of (a), (b), or (c) would require breaking API changes, state changes, or additional overhead. The logic for what to do with the timer would just have to be “hardcoded” in the operator’s code (which, by the way, might even have the additional benefit of being easier to change in case of bugs, such as a timer registered with an incorrect TimerAction).

This is complicated a bit by the question of how (if at all) options (a), (b), or (c) should be exposed to UDFs.

Maybe we need a combination of both? Pre-existing operators could implement custom handling of this issue (via 2a, 2b, or 2c), while UDFs could be handled by option 1.
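To make option (c) more concrete, here is a minimal sketch in which every pending timer is fired with an endOfInput flag and each operator hardcodes its own reaction. All names are hypothetical (EndOfInputFlagSketch, FlaggedCallback, TimerQueue, WindowSum, IdleAlert); the two toy operators only stand in for a window-like operator that wants its trailing result emitted and a timeout-like operator that does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not Flink's operator or timer API.
final class EndOfInputFlagSketch {

    interface FlaggedCallback {
        void onProcessingTime(long timestamp, boolean endOfInput);
    }

    /** Holds pending timers and fires all of them, flagged, when the input ends. */
    static final class TimerQueue {
        private final TreeMap<Long, List<FlaggedCallback>> timers = new TreeMap<>();

        void register(long timestamp, FlaggedCallback callback) {
            timers.computeIfAbsent(timestamp, k -> new ArrayList<>()).add(callback);
        }

        void fireAllAtEndOfInput() {
            // Snapshot first so callbacks can register timers without breaking iteration.
            TreeMap<Long, List<FlaggedCallback>> snapshot = new TreeMap<>(timers);
            timers.clear();
            for (Map.Entry<Long, List<FlaggedCallback>> entry : snapshot.entrySet()) {
                for (FlaggedCallback callback : entry.getValue()) {
                    callback.onProcessingTime(entry.getKey(), true);
                }
            }
        }
    }

    /** Window-like operator: emits its buffered sum whether or not the input ended. */
    static final class WindowSum implements FlaggedCallback {
        private long sum;
        final List<Long> emitted = new ArrayList<>();

        void add(long value) {
            sum += value;
        }

        @Override
        public void onProcessingTime(long timestamp, boolean endOfInput) {
            emitted.add(sum);
            sum = 0;
        }
    }

    /** Timeout-like operator: ignores timers that fire only because the input ended. */
    static final class IdleAlert implements FlaggedCallback {
        final List<Long> alerts = new ArrayList<>();

        @Override
        public void onProcessingTime(long timestamp, boolean endOfInput) {
            if (endOfInput) {
                return; // hardcoded decision: treat as cancelled
            }
            alerts.add(timestamp);
        }
    }
}
```

The decision lives entirely in each operator's onProcessingTime, so nothing is stored per timer and changing the behaviour means changing operator code only.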

Issue Links

Blocked

FLINK-29183 flink job ended before window time up while using bounded stream source

Closed

causes

FLINK-29183 flink job ended before window time up while using bounded stream source

Closed

is related to

FLINK-14228 The runtime support for Bounded[One|Multi]Input#endInput does not properly implement their semantics

Closed

People

Assignee: Unassigned

Reporter: Piotr Nowojski

Votes: 3

Watchers: 26

Dates

Created: 20/Jul/20 15:32

Updated: 10/Nov/22 12:53