Details
Bug
Status: Closed
Major
Resolution: Fixed
1.15.0
Description
Local recovery relies on knowing the allocation id of the last deployment. To that end we iterate over all previous execution attempts and use the last assignedAllocationID, if any.
However, since the execution history is bounded (by default to 16 entries), this information can be evicted.
In other words, with the default configuration (history limit = 16, restart delay = 1s), local recovery can only kick in if the TM is restarted within 16 seconds, because each failed restart attempt adds another entry to the history.
We should decouple this information from the execution (history).
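To illustrate the eviction problem and the proposed decoupling, here is a minimal, self-contained sketch. It is not the actual Flink code: the class names (`AllocationHistorySketch`, `BoundedHistory`, `Vertex`), the method names, and the use of plain strings as allocation ids are all hypothetical, chosen only to model a bounded attempt history next to a separately stored last allocation id.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.Optional;

public class AllocationHistorySketch {

    /** Hypothetical bounded history: one entry per prior attempt, empty if no slot was assigned. */
    static final class BoundedHistory {
        private final int limit;
        private final Deque<Optional<String>> entries = new ArrayDeque<>();

        BoundedHistory(int limit) {
            this.limit = limit;
        }

        void archive(Optional<String> allocationId) {
            if (entries.size() == limit) {
                entries.removeFirst(); // evict the oldest attempt
            }
            entries.addLast(allocationId);
        }

        /** Walks the history from newest to oldest and returns the first allocation id found. */
        Optional<String> findLastAllocation() {
            Iterator<Optional<String>> it = entries.descendingIterator();
            while (it.hasNext()) {
                Optional<String> id = it.next();
                if (id.isPresent()) {
                    return id;
                }
            }
            return Optional.empty();
        }
    }

    /** Hypothetical vertex that tracks the last allocation id both ways. */
    static final class Vertex {
        private final BoundedHistory history = new BoundedHistory(16); // default limit from the description
        private String lastAssignedAllocationId; // decoupled from the history, never evicted

        void archiveAttempt(Optional<String> allocationId) {
            allocationId.ifPresent(id -> lastAssignedAllocationId = id);
            history.archive(allocationId);
        }

        Optional<String> lastAllocationFromHistory() {
            return history.findLastAllocation();
        }

        Optional<String> lastAllocationDecoupled() {
            return Optional.ofNullable(lastAssignedAllocationId);
        }
    }

    public static void main(String[] args) {
        Vertex vertex = new Vertex();
        vertex.archiveAttempt(Optional.of("alloc-1")); // the last successful deployment

        // The TM is gone: 16 restart attempts (one per second) fail without getting a slot.
        for (int i = 0; i < 16; i++) {
            vertex.archiveAttempt(Optional.empty());
        }

        System.out.println(vertex.lastAllocationFromHistory()); // Optional.empty
        System.out.println(vertex.lastAllocationDecoupled());   // Optional[alloc-1]
    }
}
```

In this model, the 16th failed attempt evicts the only entry that carried an allocation id, so the history-based lookup comes back empty while the separately stored field still holds `alloc-1`.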
Attachments
Issue Links
- is related to FLINK-27127 Local recovery is not triggered on task manager process restart (Closed)
- links to