Spark / SPARK-9801

Spark Streaming deletes the temp and backup files without checking whether they exist

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.1
    • Fix Version/s: 1.3.2, 1.4.2, 1.5.0
    • Component/s: DStreams
    • Labels:
      None

      Description

      When Spark Streaming writes a checkpoint, the Spark driver log shows the following error messages:

      15/07/29 11:04:50 INFO CheckpointWriter: Saving checkpoint for time 1438135490000 ms to file 'maprfs:/user/mapr/spark-checkpoint2/checkpoint-1438135490000' 
      15/07/29 11:04:50 ERROR MapRFileSystem: Failed to delete path maprfs:/user/mapr/spark-checkpoint2/temp, error: No such file or directory (2) 
      15/07/29 11:04:50 ERROR MapRFileSystem: Failed to delete path maprfs:/user/mapr/spark-checkpoint2/checkpoint-1438135490000.bk, error: No such file or directory (2) 
      15/07/29 11:04:50 INFO CheckpointWriter: Deleting maprfs:///user/mapr/spark-checkpoint2/checkpoint-1438135480000 
      15/07/29 11:04:50 INFO CheckpointWriter: Checkpoint for time 1438135490000 ms saved to file 'maprfs:/user/mapr/spark-checkpoint2/checkpoint-1438135490000', took 8729 bytes and 14 ms 
      15/07/29 11:04:50 INFO DStreamGraph: Clearing checkpoint data for time 1438135490000 ms 
      15/07/29 11:04:50 INFO DStreamGraph: Cleared checkpoint data for time 1438135490000 ms
      

      Source code:
      https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala

      When Spark deletes the two files, it does not first check whether they exist, so the file system logs an error whenever one is missing:
      fs.delete(tempFile, true) // just in case it exists
      fs.delete(backupFile, true) // just in case it exists

      We should check whether each file exists before deleting it.
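
      The proposed check could look roughly like the sketch below. It is an illustration only, not the committed patch: the object name CheckpointCleanupSketch and the helper deleteIfExists are hypothetical, and only the Hadoop calls FileSystem.exists and FileSystem.delete are taken from the real API.

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper, not part of Spark: guards a recursive delete with an
// existence check so a missing path is never passed to fs.delete.
object CheckpointCleanupSketch {

  /** Deletes `path` recursively only if it exists.
    * Returns true when the path existed and the delete succeeded,
    * false when the path was absent or the delete failed.
    */
  def deleteIfExists(fs: FileSystem, path: Path): Boolean =
    fs.exists(path) && fs.delete(path, true)
}
```

      With such a helper, the two calls quoted above would become deleteIfExists(fs, tempFile) and deleteIfExists(fs, backupFile). The check and the delete are two separate calls, so another writer could still remove the file in between; in that case the delete can fail as before.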

            People

            • Assignee:
              haozhu Hao Zhu
              Reporter:
              haozhu Hao Zhu
            • Votes:
              0
              Watchers:
              4