Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-32269

Failed to rename delta file on checkpoint path

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 2.3.0
    • None
    • Deploy
    • None

    Description

      Hi Team, 
      I got an exception that :

      // code placeholder
      Aborting task
      java.lang.IllegalStateException: Error committing version 414 into HDFSStateStore[id=(op=0,part=190),dir=hdfs://alfslc/user/veda/veda_checkpoint/vedaaltus10thserv_0.0.1_202007100508/state/0/190]
      	at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$HDFSBackedStateStore.commit(HDFSBackedStateStoreProvider.scala:136)
      	at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExec$$anonfun$doExecute$1$$anonfun$apply$1.apply$mcV$sp(FlatMapGroupsWithStateExec.scala:142)
      	at org.apache.spark.util.CompletionIterator$$anon$1.completion(CompletionIterator.scala:44)
      	at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:33)
      	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown Source)
      	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
      	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
      	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
      	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.foreach(WholeStageCodegenExec.scala:612)
      	at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2.scala:130)
      	at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2.scala:129)
      	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)
      	at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2.scala:135)
      	at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$2.apply(WriteToDataSourceV2.scala:79)
      	at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$2.apply(WriteToDataSourceV2.scala:78)
      	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
      	at org.apache.spark.scheduler.Task.run(Task.scala:109)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.IOException: Failed to rename hdfs://alfslc/user/veda/veda_checkpoint/vedaaltus10thserv_0.0.1_202007100508/state/0/190/temp-5581210200987871585 to hdfs://alfslc/user/veda/veda_checkpoint/vedaaltus10thserv_0.0.1_202007100508/state/0/190/414.delta
      	at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$commitUpdates(HDFSBackedStateStoreProvider.scala:275)
      	at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$HDFSBackedStateStore.commit(HDFSBackedStateStoreProvider.scala:130)
      	... 20 more
      

      It often occurs in production env and my Hadoop version is :

       

      Hadoop Version:2.7.3.2.6.5.0-292

      Spark version: 2.3.0

      I used the mapGroupWithState and stored the state.

      Attachments

        Activity

          People

            Unassigned Unassigned
            ElvisQaQ Elvis
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: