Details
Bug
Status: Resolved
Major
Resolution: Fixed
0.10.0
None
Description
When the RollingSinkFaultToleranceITCase is executed with Hadoop 2.7.1, the test either does not finish, because it is stuck in an endless restart loop with the following exception:
java.lang.Exception: Could not restore checkpointed state to operators and functions
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreStateLazy(StreamTask.java:414)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:208)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.Exception: Failed to restore state to function: In-Progress file hdfs://127.0.0.1:52884/string-non-rolling-out/part-0-1 was neither moved to pending nor is still in progress.
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.restoreState(AbstractUdfStreamOperator.java:165)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreStateLazy(StreamTask.java:406)
    ... 3 more
Caused by: java.lang.RuntimeException: In-Progress file hdfs://127.0.0.1:52884/string-non-rolling-out/part-0-1 was neither moved to pending nor is still in progress.
    at org.apache.flink.streaming.connectors.fs.RollingSink.restoreState(RollingSink.java:670)
    at org.apache.flink.streaming.connectors.fs.RollingSink.restoreState(RollingSink.java:120)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.restoreState(AbstractUdfStreamOperator.java:162)
    ... 4 more
or it fails because the number of strings read differs from the expected exactly-once result (some strings are read multiple times).