Details
- Type: Bug
- Status: Open
- Priority: Minor
- Resolution: Unresolved
Description
With the following DAG:
A -> B -> C
C is a stateless operator. If the application is killed and then restarted after a long interval, the recovery window id of C is much higher than those of A and B. This is because updateRecoveryCheckpoints computes the recovery window id of a stateless operator from the current timestamp.
This causes the following problems:
- Operator C does not process any data until the window id of B reaches the recovery window id of C.
- If the other operators are not able to keep up, C gets killed because it is detected as a blocked operator.
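To illustrate the size of the gap, here is a minimal, self-contained sketch. It is not the platform's code: the class RecoveryWindowSketch, its methods, the 500 ms window width, and the plain counter-style window id are all assumptions made for illustration, and the real window id encoding may differ. It only shows that a recovery window id derived from the restart time runs ahead of one derived from the last checkpoint by roughly the downtime divided by the window width.

```java
// Illustrative sketch only: names, window width, and id encoding are hypothetical.
final class RecoveryWindowSketch {
    // Assumed streaming window width in milliseconds (not taken from the issue).
    static final long WINDOW_WIDTH_MILLIS = 500L;

    // Simplified window id: number of whole windows elapsed since the first window.
    static long windowIdAt(long timestampMillis, long firstWindowMillis) {
        return (timestampMillis - firstWindowMillis) / WINDOW_WIDTH_MILLIS;
    }

    // Number of windows B must still emit before C starts processing data.
    static long windowsUntilCResumes(long bRecoveryWindowId, long cRecoveryWindowId) {
        return Math.max(0L, cRecoveryWindowId - bRecoveryWindowId);
    }

    public static void main(String[] args) {
        long firstWindowMillis = 0L;
        long killedAtMillis = 60_000L;                         // killed 1 minute after start
        long restartedAtMillis = killedAtMillis + 3_600_000L;  // restarted 1 hour later

        // B resumes from the window of its last checkpoint (approximated by kill time).
        long bRecovery = windowIdAt(killedAtMillis, firstWindowMillis);
        // C is stateless, so its recovery window is derived from the current time.
        long cRecovery = windowIdAt(restartedAtMillis, firstWindowMillis);

        System.out.println("B recovery window id: " + bRecovery);
        System.out.println("C recovery window id: " + cRecovery);
        System.out.println("Windows C stays idle: "
                + windowsUntilCResumes(bRecovery, cRecovery));
    }
}
```

With these assumed numbers, C would sit idle for 7200 windows of upstream progress, which is the situation in which it can be flagged as a blocked operator.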