Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-20894

Error while checkpointing to HDFS

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 2.1.1
    • 2.3.0
    • Structured Streaming
    • None
    • Ubuntu, Spark 2.1.1, hadoop 2.7

    Description

      Dataset<Row> df2 = df1.groupBy(functions.window(df1.col("Timestamp5"), "24 hours", "24 hours"), df1.col("AppName")).count();

      StreamingQuery query = df2.writeStream().foreach(new KafkaSink()).option("checkpointLocation","/usr/local/hadoop/checkpoint").outputMode("update").start();
      query.awaitTermination();

      This for some reason fails with the Error

      ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
      java.lang.IllegalStateException: Error reading delta file /usr/local/hadoop/checkpoint/state/0/0/1.delta of HDFSStateStoreProvider[id = (op=0, part=0), dir = /usr/local/hadoop/checkpoint/state/0/0]: /usr/local/hadoop/checkpoint/state/0/0/1.delta does not exist

      I did clear all the checkpoint data in /usr/local/hadoop/checkpoint/ and all consumer offsets in Kafka from all brokers prior to running and yet this error still persists.

      Attachments

        1. executor1_log
          122 kB
          kant kodali
        2. executor2_log
          6 kB
          kant kodali
        3. driver_info_log
          177 kB
          kant kodali

        Activity

          People

            zsxwing Shixiong Zhu
            kant kant kodali
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: