Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- None
Description
After FLINK-26049, every failed checkpoint is logged with the message {{Failed to trigger checkpoint}}, regardless of the phase in which it failed:
5812 [pool-5-thread-1] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering checkpoint 1 (type=CheckpointType{name='Checkpoint', sharingFilesStrategy=FORWARD_BACKWARD}) @ 1646825286424 for job d2fd07b3b33af453a4e115f3197f81bb.
5913 [pool-5-thread-1] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Checkpoint 1 of job d2fd07b3b33af453a4e115f3197f81bb expired before completing.
451518 [pool-5-thread-1] WARN org.apache.flink.runtime.checkpoint.CheckpointFailureManager [] - Failed to trigger checkpoint 1 for job d2fd07b3b33af453a4e115f3197f81bb. (0 consecutive failed attempts so far)
org.apache.flink.runtime.checkpoint.CheckpointException: Checkpoint expired before completing.
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator$CheckpointCanceller.run(CheckpointCoordinator.java:2172) [classes/:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_292]
    at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266) [?:1.8.0_292]
    at java.util.concurrent.FutureTask.run(FutureTask.java) [?:1.8.0_292]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_292]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_292]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_292]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_292]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
This is misleading: in the log above the checkpoint was triggered successfully and then expired, so the failure did not happen during the trigger phase.
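One possible way to avoid the misleading wording is to choose the log message from the phase in which the failure occurred. The following is only a minimal sketch under stated assumptions: the class `CheckpointLogSketch`, the enum `FailureReason`, its `duringTrigger` flag, and the "Failed to complete" wording are all hypothetical names made up for illustration and are not taken from Flink's code or from the actual fix.

```java
/** Illustrative sketch only; these names are hypothetical, not Flink's classes. */
final class CheckpointLogSketch {

    /** Hypothetical failure reasons, each tagged with the phase it belongs to. */
    enum FailureReason {
        TRIGGER_DECLINED(true, "Checkpoint trigger was declined."),
        EXPIRED(false, "Checkpoint expired before completing.");

        final boolean duringTrigger; // true if the failure happens while triggering
        final String message;

        FailureReason(boolean duringTrigger, String message) {
            this.duringTrigger = duringTrigger;
            this.message = message;
        }
    }

    /** Builds the warning text, mentioning "trigger" only for trigger-phase failures. */
    static String describeFailure(
            long checkpointId, String jobId, FailureReason reason, int consecutiveFailures) {
        // Assumed wording: "trigger" for trigger-phase failures, "complete" otherwise.
        String verb = reason.duringTrigger ? "trigger" : "complete";
        return "Failed to " + verb + " checkpoint " + checkpointId
                + " for job " + jobId
                + ". (" + consecutiveFailures + " consecutive failed attempts so far)";
    }

    public static void main(String[] args) {
        String jobId = "d2fd07b3b33af453a4e115f3197f81bb";
        // An expired checkpoint was already triggered, so it should not say "trigger".
        System.out.println(describeFailure(1L, jobId, FailureReason.EXPIRED, 0));
        System.out.println(describeFailure(2L, jobId, FailureReason.TRIGGER_DECLINED, 1));
    }
}
```

With a phase flag on the reason itself, the logging site does not need to know which code path reported the failure; it only inspects the reason.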
Issue Links
- is caused by FLINK-26049 The tolerable-failed-checkpoints logic is invalid when checkpoint trigger failed (Closed)
- links to