Details
-
Improvement
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
The task's RUNNING state was split into two states: INITIALIZING and RUNNING. A task is INITIALIZING while its state is being initialized and, in the case of unaligned checkpoints, until all of the in-flight data has been recovered.
Description
Currently a task switches to RUNNING before it is fully initialized. The switch does not take state initialization and operator initialization (#open) into account, both of which may take a long time to finish. As a result, a job can end up in a confusing situation where all tasks are reported as running but throughput is 0.
I think it would be good if we could expose the initialization stage of tasks. What do you think?
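To make the proposal concrete, here is a minimal sketch of how a task's reported state could be derived once an initialization stage is exposed. It is only an illustration: the class `TaskLifecycleSketch`, the enum `TaskState`, and the method `stateFor` with its three flags are hypothetical names chosen for this example and are not Flink's actual `ExecutionState` or task code.

```java
class TaskLifecycleSketch {

    /** Hypothetical, simplified states; not Flink's real ExecutionState enum. */
    enum TaskState { INITIALIZING, RUNNING }

    /**
     * Derives the state a task would report.
     *
     * @param stateRestored         true once state and operator initialization have finished
     * @param unalignedCheckpoints  true if the job restores from an unaligned checkpoint
     * @param inFlightDataRecovered true once all in-flight data has been recovered
     */
    static TaskState stateFor(boolean stateRestored,
                              boolean unalignedCheckpoints,
                              boolean inFlightDataRecovered) {
        // Still restoring state or opening operators: not yet processing records.
        if (!stateRestored) {
            return TaskState.INITIALIZING;
        }
        // With unaligned checkpoints, in-flight data must be recovered first.
        if (unalignedCheckpoints && !inFlightDataRecovered) {
            return TaskState.INITIALIZING;
        }
        return TaskState.RUNNING;
    }

    public static void main(String[] args) {
        System.out.println(stateFor(false, false, false)); // INITIALIZING
        System.out.println(stateFor(true, true, false));   // INITIALIZING
        System.out.println(stateFor(true, true, true));    // RUNNING
        System.out.println(stateFor(true, false, false));  // RUNNING
    }
}
```

With a distinction like this, a job whose tasks are still restoring state would show INITIALIZING instead of RUNNING, which explains a throughput of 0 without further digging.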
Issue Links
- blocks
-
FLINK-4815 Automatic fallback to earlier checkpoints when checkpoint restore fails
- Open
- causes
-
FLINK-22535 Resource leak would happen if exception thrown during AbstractInvokable#restore of task life
- Closed
-
FLINK-23034 NPE in JobDetailsDeserializer during the reading old version of ExecutionState
- Closed
-
FLINK-22215 Rename RECOVERING state to INITIALIZING
- Resolved
- is duplicated by
-
FLINK-20087 CheckpointCoordinator waits until all tasks finish initialization of states to trigger checkpoint
- Closed
-
FLINK-20912 Increase Log and Metric: Time consumed by Checkpoint Restore
- Closed
- relates to
-
FLINK-18983 Job doesn't changed to failed if close function has blocked
- Reopened
-
FLINK-22379 Do not trigger checkpoint when non source tasks are INITIALIZING
- Resolved
- supersedes
-
FLINK-4714 Set task state to RUNNING after state has been restored
- Closed
- links to