Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- Version: 1.14.0
Description
When a job is stopped with a savepoint while one of its tasks is finishing, the stop action times out.
Flink conf:
state.savepoints.dir: file:///tmp/flink-savepoints
state.backend: rocksdb
state.backend.incremental: true
state.checkpoints.dir: file:///tmp/flink-ckp/
execution.checkpointing.aligned-checkpoint-timeout: 30 s
execution.checkpointing.interval: 5 s
taskmanager.numberOfTaskSlots: 2
execution.checkpointing.checkpoints-after-tasks-finish.enabled: true
How to reproduce:
bin/flink run -d -p 4 examples/streaming/WordCount.jar
# while one task is finishing
bin/flink stop $JOB_ID
Client log:
------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.util.FlinkException: Could not stop with a savepoint job "e139a2eba7f8dc0b07fab65e84421ee4".
    at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:581)
    at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1002)
    at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:569)
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1069)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
    at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
Caused by: java.util.concurrent.TimeoutException
    at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
    at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:579)
    ... 6 more
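The trace shows the client failing inside a timed CompletableFuture.get(...) call, which suggests the future for the stop-with-savepoint result was never completed before the client's timeout expired. The sketch below only illustrates that client-side symptom using the JDK. It does not use any Flink API: the class name StopTimeoutSketch, the method awaitSavepoint, the 200 ms timeout, and the example savepoint path are all hypothetical, and it says nothing about the root cause on the cluster side.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration, not Flink code: a timed get() on a future
// that is never completed ends in a TimeoutException, as in the client log.
class StopTimeoutSketch {

    // Waits up to timeoutMillis for the savepoint path.
    // Returns the path, or the marker "TIMEOUT" if the future did not complete in time.
    static String awaitSavepoint(CompletableFuture<String> savepointPath, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        try {
            return savepointPath.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Same exception type as the "Caused by" entry in the client log.
            return "TIMEOUT";
        }
    }

    public static void main(String[] args) throws Exception {
        // Stands in for a stop request whose savepoint never finishes.
        CompletableFuture<String> neverCompleted = new CompletableFuture<>();
        System.out.println(awaitSavepoint(neverCompleted, 200));

        // Stands in for a stop request that succeeds; the path is made up.
        CompletableFuture<String> completed =
                CompletableFuture.completedFuture("file:///tmp/flink-savepoints/example");
        System.out.println(awaitSavepoint(completed, 200));
    }
}
```

In the first call nothing ever completes the future, so the wait ends only when the timeout elapses. This matches what the client reports: a bare TimeoutException with no message, wrapped in the FlinkException.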
Attachments
Issue Links
- is caused by: FLINK-2491 Support Checkpoints After Tasks Finished (Closed)
- is related to: FLINK-23532 Unify stop-with-savepoint w drain and w/o drain (Closed)
- links to