Spark / SPARK-718

NPE when performing action during transformation


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 0.7.0
    • None
    • None
    • None

    Description

      In the Spark shell, the following code fails with an NPE when collecting the resulting RDD:

      val data = sc.parallelize(1 to 10)
      data.map(i => data.count).collect
      
      ERROR local.LocalScheduler: Exception in task 0
      java.lang.NullPointerException
              at spark.RDD.count(RDD.scala:490)
              at $line16.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply$mcJI$sp(<console>:15)
              at $line16.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:15)
              at $line16.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:15)
              at scala.collection.Iterator$$anon$19.next(Iterator.scala:401)
              at scala.collection.Iterator$class.foreach(Iterator.scala:772)
              at scala.collection.Iterator$$anon$19.foreach(Iterator.scala:399)
              at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
              at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:102)
              at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:250)
              at scala.collection.Iterator$$anon$19.toBuffer(Iterator.scala:399)
              at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:237)
              at scala.collection.Iterator$$anon$19.toArray(Iterator.scala:399)
              at spark.RDD$$anonfun$1.apply(RDD.scala:389)
              at spark.RDD$$anonfun$1.apply(RDD.scala:389)
              at spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:610)
              at spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:610)
              at spark.scheduler.ResultTask.run(ResultTask.scala:76)
              at spark.scheduler.local.LocalScheduler.runTask$1(LocalScheduler.scala:74)
              at spark.scheduler.local.LocalScheduler$$anon$1.run(LocalScheduler.scala:50)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
              at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
              at java.util.concurrent.FutureTask.run(FutureTask.java:166)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
              at java.lang.Thread.run(Thread.java:722)
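
      A possible workaround, sketched here and not taken from the report: the trace shows `data.count` being called from inside the `map` closure, that is, from within a running task. One likely reading (an assumption, since the issue does not state the cause) is that an RDD referenced inside a task has no usable SparkContext, so running an action on it there fails. Under that assumption, running the action once on the driver and capturing only its plain result in the closure avoids the nested call. The name `total` is introduced here for illustration.

```scala
// Assumes the Spark shell, where `sc` is the predefined SparkContext.
val data = sc.parallelize(1 to 10)

// Run the action once on the driver; the result is a plain Long.
val total = data.count

// The closure now captures only `total`, not the RDD itself,
// so no RDD action is invoked from inside a task.
val result = data.map(i => total).collect
```

      With this change `result` should hold one copy of the count per input element instead of failing in the task.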
      

          People

            Assignee: Unassigned
            Reporter: eleaar Krzywicki