Details
- Technical Debt
- Status: Open
- Major
- Resolution: Unresolved
- 1.14.0, 1.13.2
- None
Description
We need to re-examine the startup procedure of the scheduler, and how it interacts with the startup of the operator coordinators.
We need to make sure the following conditions are met:
- The Operator Coordinators are started before the first action happens that they need to be informed of. That includes a task being ready, a checkpoint happening, etc.
- The scheduler must be started far enough that it can handle "failGlobal()" calls, because a coordinator might trigger one during its startup when an exception occurs in "start()".
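The two conditions above can be sketched as a startup order. This is only an illustration, not Flink's actual scheduler code: "Coordinator", "SketchScheduler", "startScheduling()" and "notifyTaskReady()" are hypothetical names, and the sketch assumes that an exception thrown from "start()" is routed to "failGlobal()".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an operator coordinator (not Flink's interface).
interface Coordinator {
    void start() throws Exception;
}

// Hypothetical scheduler that enforces the two ordering conditions.
class SketchScheduler {
    private final List<Coordinator> coordinators;
    private final List<String> log = new ArrayList<>();
    private boolean canHandleGlobalFailure = false;
    private boolean coordinatorsStarted = false;

    SketchScheduler(List<Coordinator> coordinators) {
        this.coordinators = coordinators;
    }

    // Must be callable before any coordinator is started.
    void failGlobal(Throwable cause) {
        if (!canHandleGlobalFailure) {
            throw new IllegalStateException("failGlobal() called too early");
        }
        log.add("failGlobal: " + cause.getMessage());
    }

    void startScheduling() {
        // Step 1: bring the scheduler far enough up to handle failGlobal().
        canHandleGlobalFailure = true;
        log.add("failure handling ready");

        // Step 2: start the coordinators; a failing start() becomes a global failure.
        for (Coordinator coordinator : coordinators) {
            try {
                coordinator.start();
                log.add("coordinator started");
            } catch (Exception e) {
                failGlobal(e);
                return; // do not continue with scheduling after a global failure
            }
        }
        coordinatorsStarted = true;

        // Step 3: only now may actions happen that coordinators must be informed of.
        notifyTaskReady();
    }

    private void notifyTaskReady() {
        if (!coordinatorsStarted) {
            throw new IllegalStateException("task ready before coordinators started");
        }
        log.add("task ready");
    }

    List<String> log() {
        return log;
    }
}
```

In this sketch the guards in "failGlobal()" and "notifyTaskReady()" turn an ordering violation into an immediate error instead of a silent race, which is one possible way to make the startup contract checkable.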
/cc chesnay
Issue Links
- is caused by FLINK-24303 SourceCoordinator exception may fail Session Cluster (Closed)
- is related to FLINK-24303 SourceCoordinator exception may fail Session Cluster (Closed)