SPARK-11051

NullPointerException when action called on localCheckpointed RDD (that was checkpointed before)

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.5.2, 1.6.0
    • Component/s: Spark Core
    • Labels:
      None
    • Environment:

      Spark version 1.6.0-SNAPSHOT, built from the sources as of October 10, 2015

      Description

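      While experimenting with the RDD.checkpoint and RDD.localCheckpoint methods, I got the following NullPointerException when calling an action on an RDD that had already been checkpointed and was then marked for local checkpointing: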

      scala> lines.count
      java.lang.NullPointerException
        at org.apache.spark.rdd.RDD.firstParent(RDD.scala:1587)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1927)
        at org.apache.spark.rdd.RDD.count(RDD.scala:1115)
        ... 48 elided
      

      To reproduce the issue, run the following in spark-shell:

      $ ./bin/spark-shell
      Welcome to
            ____              __
           / __/__  ___ _____/ /__
          _\ \/ _ \/ _ `/ __/  '_/
         /___/ .__/\_,_/_/ /_/\_\   version 1.6.0-SNAPSHOT
            /_/
      
      Using Scala version 2.11.7 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60)
      Type in expressions to have them evaluated.
      Type :help for more information.
      
      scala> val lines = sc.textFile("README.md")
      lines: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at <console>:24
      
      scala> sc.setCheckpointDir("checkpoints")
      
      scala> lines.checkpoint
      
      scala> lines.count
      res2: Long = 98
      
      scala> lines.localCheckpoint
      15/10/10 22:59:20 WARN MapPartitionsRDD: RDD was already marked for reliable checkpointing: overriding with local checkpoint.
      res4: lines.type = MapPartitionsRDD[1] at textFile at <console>:24
      
      scala> lines.count
      java.lang.NullPointerException
        at org.apache.spark.rdd.RDD.firstParent(RDD.scala:1587)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1927)
        at org.apache.spark.rdd.RDD.count(RDD.scala:1115)
        ... 48 elided
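
The session above suggests that the failure appears only when localCheckpoint is called after the reliable checkpoint has already been materialized by an action. Under that assumption, one possible workaround on affected versions is to skip the local checkpoint request when RDD.isCheckpointed is already true. The sketch below is illustrative only: the object LocalCheckpointGuard and the helper localCheckpointIfSafe are hypothetical names, not part of Spark, and sc.parallelize stands in for the README.md input so that the example does not depend on a file.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object LocalCheckpointGuard {

  // Hypothetical helper (not a Spark API): request a local checkpoint only
  // if the RDD has not already been checkpointed and materialized.
  def localCheckpointIfSafe[T](rdd: RDD[T]): RDD[T] =
    if (rdd.isCheckpointed) rdd   // already checkpointed: leave it untouched
    else rdd.localCheckpoint()    // otherwise mark it for local checkpointing

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SPARK-11051-guard").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      sc.setCheckpointDir("checkpoints")

      // Stand-in for sc.textFile("README.md") from the report.
      val lines = sc.parallelize(1 to 100).map(i => s"line $i")

      lines.checkpoint()
      val before = lines.count()    // first action materializes the reliable checkpoint

      localCheckpointIfSafe(lines)  // skipped here, because lines.isCheckpointed is true
      val after = lines.count()     // second action on the same RDD

      println(s"before=$before after=$after")
    } finally {
      sc.stop()
    }
  }
}
```

The guard only avoids the problematic call sequence; it does not change how either checkpoint mode works, and it is unnecessary on versions that contain the fix.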
      

            People

            • Assignee:
              L. C. Hsieh (viirya)
            • Reporter:
              Jacek Laskowski (jlaskowski)
            • Votes:
              0
            • Watchers:
              4
