Details
-
Bug
-
Status: Closed
-
Minor
-
Resolution: Information Provided
-
1.1.4
-
None
Description
Steps to reproduce:
- setup flink cluster in HA mode
- submit job with rocksdb state backend and enableFullyAsyncSnapshots
- send some load to the job
- in the middle of processing cancel job using the command: ./flink cancel <jobId>
In the JobManager logs:
2017-02-09 13:55:49,511 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Stopping checkpoint coordinator for job e140ad8a3deeae991a9bbe080222d3f6 2017-02-09 13:55:49,517 INFO org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - Removed job graph e140ad8a3deeae991a9bbe080222d3f6 from ZooKeeper. 2017-02-09 13:55:49,519 WARN org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore - Failed to discard checkpoint 1. java.lang.Exception: Could not discard the completed checkpoint Checkpoint 1 @ 1486648542769 for e140ad8a3deeae991a9bbe080222d3f6. at org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore$1.processResult(ZooKeeperCompletedCheckpointStore.java:308) at org.apache.flink.shaded.org.apache.curator.framework.imps.Backgrounding$1$1.run(Backgrounding.java:109) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ClassNotFoundException: org.apache.flink.contrib.streaming.state.RocksDBStateBackend$FinalFullyAsyncSnapshot at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:65) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1781) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373) at java.util.HashMap.readObject(HashMap.java:1396) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1714) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373) at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:291) at org.apache.flink.util.SerializedValue.deserializeValue(SerializedValue.java:58) at org.apache.flink.runtime.checkpoint.SubtaskState.discard(SubtaskState.java:85) at org.apache.flink.runtime.checkpoint.TaskState.discard(TaskState.java:147) at org.apache.flink.runtime.checkpoint.CompletedCheckpoint.discard(CompletedCheckpoint.java:102) at org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore$1.processResult(ZooKeeperCompletedCheckpointStore.java:306) ... 4 more
Looks very similar to https://issues.apache.org/jira/browse/FLINK-5468