Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
2.1.1
-
None
-
Ubuntu, Spark 2.1.1, hadoop 2.7
Description
Dataset<Row> df2 = df1.groupBy(functions.window(df1.col("Timestamp5"), "24 hours", "24 hours"), df1.col("AppName")).count();
StreamingQuery query = df2.writeStream().foreach(new KafkaSink()).option("checkpointLocation","/usr/local/hadoop/checkpoint").outputMode("update").start();
query.awaitTermination();
This for some reason fails with the Error
ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
java.lang.IllegalStateException: Error reading delta file /usr/local/hadoop/checkpoint/state/0/0/1.delta of HDFSStateStoreProvider[id = (op=0, part=0), dir = /usr/local/hadoop/checkpoint/state/0/0]: /usr/local/hadoop/checkpoint/state/0/0/1.delta does not exist
I did clear all the checkpoint data in /usr/local/hadoop/checkpoint/ and all consumer offsets in Kafka from all brokers prior to running and yet this error still persists.