SPARK-22243: Streaming job failed to restart from checkpoint

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0, 2.2.0
    • Fix Version/s: 2.2.1, 2.3.0
    • Component/s: DStreams
    • Labels:
      None

      Description

      My spark-defaults.conf contains a setting relevant to this issue. I uploaded all jars from Spark's jars folder to an HDFS path and pointed spark.yarn.jars at it:
      spark.yarn.jars hdfs:///spark/cache/spark2.2/*

      The streaming job fails to restart from its checkpoint: the ApplicationMaster throws "Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher". The problem is always reproducible.

      I examined the SparkConf object recovered from the checkpoint and found that spark.yarn.jars is set to empty, so none of the jars are available on the AM side. The fix is to reload spark.yarn.jars from the properties file when recovering from a checkpoint.
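
      The proposed fix can be illustrated with a minimal, dependency-free sketch. It is not Spark's actual code: plain Maps stand in for the checkpointed SparkConf and for the settings loaded from the properties file at restart, and the names ReloadOnRecovery, propertiesToReload, and recoverConf are hypothetical. The idea is only that selected keys, here spark.yarn.jars, take their value from the current environment rather than from the checkpoint.

```scala
// Illustrative sketch only: Maps stand in for SparkConf and for the
// properties loaded at restart. All names here are hypothetical.
object ReloadOnRecovery {
  // Keys whose checkpointed value should be replaced on recovery.
  // "spark.yarn.jars" is the key this issue proposes to reload.
  val propertiesToReload: Seq[String] = Seq("spark.yarn.jars")

  /** Returns the checkpointed settings, with each reloadable key
    * overridden by its current value when one is available. */
  def recoverConf(
      checkpointed: Map[String, String],
      current: Map[String, String]): Map[String, String] =
    propertiesToReload.foldLeft(checkpointed) { (conf, key) =>
      current.get(key) match {
        case Some(value) => conf.updated(key, value) // prefer the fresh value
        case None        => conf                     // keep the checkpointed one
      }
    }

  def main(args: Array[String]): Unit = {
    // Simulates the state described above: the recovered value is empty.
    val checkpointed = Map(
      "spark.app.name"  -> "CheckpointTest",
      "spark.yarn.jars" -> "")
    // Simulates the value read from spark-defaults.conf at restart.
    val current = Map("spark.yarn.jars" -> "hdfs:///spark/cache/spark2.2/*")
    println(recoverConf(checkpointed, current))
  }
}
```

      Keys outside the reload list keep their checkpointed values, so the sketch changes only the setting that the restart depends on.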

      Attached is a demo that reproduces the issue.

        Attachments

        1. CheckpointTest.scala
          0.9 kB
          StephenZou

              People

              • Assignee:
                desmoon StephenZou
                Reporter:
                desmoon StephenZou