SPARK-1786: Kryo Serialization Error in GraphX


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.0
    • Component/s: GraphX
    • Labels: None

    Description

      The following code block generates a serialization error when run in the spark-shell with Kryo enabled:

      import org.apache.spark.storage._
      import org.apache.spark.graphx._
      import org.apache.spark.graphx.util._
      
      val g = GraphGenerators.gridGraph(sc, 100, 100)
      val e = g.edges
      e.persist(StorageLevel.MEMORY_ONLY_SER)
      e.collect().foreach(println(_)) // <- Runs successfully the first time.
      
      // The following line will fail:
      e.collect().foreach(println(_))
      
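      For context, "with Kryo enabled" means the shell was started with Kryo configured as the serializer, along these lines (spark-defaults.conf style; this assumes the GraphX registrator that ships with 1.0.0 and an otherwise default spark-shell setup):

```
spark.serializer        org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator  org.apache.spark.graphx.GraphKryoRegistrator
```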

      The following error is generated:

      scala> e.collect().foreach(println(_))
      14/05/09 18:31:13 INFO SparkContext: Starting job: collect at EdgeRDD.scala:59
      14/05/09 18:31:13 INFO DAGScheduler: Got job 1 (collect at EdgeRDD.scala:59) with 8 output partitions (allowLocal=false)
      14/05/09 18:31:13 INFO DAGScheduler: Final stage: Stage 1(collect at EdgeRDD.scala:59)
      14/05/09 18:31:13 INFO DAGScheduler: Parents of final stage: List()
      14/05/09 18:31:13 INFO DAGScheduler: Missing parents: List()
      14/05/09 18:31:13 INFO DAGScheduler: Submitting Stage 1 (MappedRDD[15] at map at EdgeRDD.scala:59), which has no missing parents
      14/05/09 18:31:13 INFO DAGScheduler: Submitting 8 missing tasks from Stage 1 (MappedRDD[15] at map at EdgeRDD.scala:59)
      14/05/09 18:31:13 INFO TaskSchedulerImpl: Adding task set 1.0 with 8 tasks
      14/05/09 18:31:13 INFO TaskSetManager: Starting task 1.0:0 as TID 8 on executor localhost: localhost (PROCESS_LOCAL)
      14/05/09 18:31:13 INFO TaskSetManager: Serialized task 1.0:0 as 1779 bytes in 3 ms
      14/05/09 18:31:13 INFO TaskSetManager: Starting task 1.0:1 as TID 9 on executor localhost: localhost (PROCESS_LOCAL)
      14/05/09 18:31:13 INFO TaskSetManager: Serialized task 1.0:1 as 1779 bytes in 4 ms
      14/05/09 18:31:13 INFO TaskSetManager: Starting task 1.0:2 as TID 10 on executor localhost: localhost (PROCESS_LOCAL)
      14/05/09 18:31:13 INFO TaskSetManager: Serialized task 1.0:2 as 1779 bytes in 4 ms
      14/05/09 18:31:13 INFO TaskSetManager: Starting task 1.0:3 as TID 11 on executor localhost: localhost (PROCESS_LOCAL)
      14/05/09 18:31:13 INFO TaskSetManager: Serialized task 1.0:3 as 1779 bytes in 4 ms
      14/05/09 18:31:13 INFO TaskSetManager: Starting task 1.0:4 as TID 12 on executor localhost: localhost (PROCESS_LOCAL)
      14/05/09 18:31:13 INFO TaskSetManager: Serialized task 1.0:4 as 1779 bytes in 3 ms
      14/05/09 18:31:13 INFO TaskSetManager: Starting task 1.0:5 as TID 13 on executor localhost: localhost (PROCESS_LOCAL)
      14/05/09 18:31:13 INFO TaskSetManager: Serialized task 1.0:5 as 1782 bytes in 4 ms
      14/05/09 18:31:13 INFO TaskSetManager: Starting task 1.0:6 as TID 14 on executor localhost: localhost (PROCESS_LOCAL)
      14/05/09 18:31:13 INFO TaskSetManager: Serialized task 1.0:6 as 1783 bytes in 4 ms
      14/05/09 18:31:13 INFO TaskSetManager: Starting task 1.0:7 as TID 15 on executor localhost: localhost (PROCESS_LOCAL)
      14/05/09 18:31:13 INFO TaskSetManager: Serialized task 1.0:7 as 1783 bytes in 4 ms
      14/05/09 18:31:13 INFO Executor: Running task ID 9
      14/05/09 18:31:13 INFO Executor: Running task ID 8
      14/05/09 18:31:13 INFO Executor: Running task ID 11
      14/05/09 18:31:13 INFO Executor: Running task ID 14
      14/05/09 18:31:13 INFO Executor: Running task ID 10
      14/05/09 18:31:13 INFO Executor: Running task ID 13
      14/05/09 18:31:13 INFO Executor: Running task ID 15
      14/05/09 18:31:13 INFO Executor: Running task ID 12
      14/05/09 18:31:13 INFO BlockManager: Found block rdd_12_6 locally
      14/05/09 18:31:13 INFO BlockManager: Found block rdd_12_4 locally
      14/05/09 18:31:13 INFO BlockManager: Found block rdd_12_2 locally
      14/05/09 18:31:13 INFO BlockManager: Found block rdd_12_7 locally
      14/05/09 18:31:13 INFO BlockManager: Found block rdd_12_1 locally
      14/05/09 18:31:13 INFO BlockManager: Found block rdd_12_3 locally
      14/05/09 18:31:13 INFO BlockManager: Found block rdd_12_0 locally
      14/05/09 18:31:13 INFO BlockManager: Found block rdd_12_5 locally
      14/05/09 18:31:13 ERROR Executor: Exception in task ID 13
      java.lang.NullPointerException
      	at org.apache.spark.graphx.impl.EdgePartition$$anon$1.next(EdgePartition.scala:269)
      	at org.apache.spark.graphx.impl.EdgePartition$$anon$1.next(EdgePartition.scala:262)
      	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
      	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
      	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
      	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
      	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
      	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
      	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
      	at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
      	at scala.collection.AbstractIterator.to(Iterator.scala:1157)
      	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
      	at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
      	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
      	at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
      	at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:706)
      	at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:706)
      	at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1071)
      	at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1071)
      	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
      	at org.apache.spark.scheduler.Task.run(Task.scala:51)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:208)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:744)
      [The identical NullPointerException stack trace follows for task IDs 10, 11, 12, 15, 8, 9, and 14; the repeats are elided.]
      14/05/09 18:31:13 WARN TaskSetManager: Lost TID 11 (task 1.0:3)
      14/05/09 18:31:13 WARN TaskSetManager: Loss was due to java.lang.NullPointerException
      [Same stack trace as above.]
      14/05/09 18:31:13 ERROR TaskSetManager: Task 1.0:3 failed 1 times; aborting job
      14/05/09 18:31:13 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
      14/05/09 18:31:13 INFO TaskSetManager: Loss was due to java.lang.NullPointerException [duplicate 1]
      14/05/09 18:31:13 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
      14/05/09 18:31:13 INFO TaskSetManager: Loss was due to java.lang.NullPointerException [duplicate 2]
      14/05/09 18:31:13 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
      14/05/09 18:31:13 INFO TaskSetManager: Loss was due to java.lang.NullPointerException [duplicate 3]
      14/05/09 18:31:13 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
      14/05/09 18:31:13 INFO TaskSetManager: Loss was due to java.lang.NullPointerException [duplicate 4]
      14/05/09 18:31:13 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
      14/05/09 18:31:13 INFO DAGScheduler: Failed to run collect at EdgeRDD.scala:59
      14/05/09 18:31:13 INFO TaskSchedulerImpl: Cancelling stage 1
      14/05/09 18:31:13 INFO TaskSetManager: Loss was due to java.lang.NullPointerException [duplicate 5]
      14/05/09 18:31:13 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
      14/05/09 18:31:13 INFO TaskSetManager: Loss was due to java.lang.NullPointerException [duplicate 6]
      14/05/09 18:31:13 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
      14/05/09 18:31:13 INFO TaskSetManager: Loss was due to java.lang.NullPointerException [duplicate 7]
      14/05/09 18:31:13 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
      org.apache.spark.SparkException: Job aborted due to stage failure: Task 1.0:3 failed 1 times, most recent failure: Exception failure in TID 11 on host localhost: java.lang.NullPointerException
              org.apache.spark.graphx.impl.EdgePartition$$anon$1.next(EdgePartition.scala:269)
              org.apache.spark.graphx.impl.EdgePartition$$anon$1.next(EdgePartition.scala:262)
              scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
              scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
              scala.collection.Iterator$class.foreach(Iterator.scala:727)
              scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
              scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
              scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
              scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
              scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
              scala.collection.AbstractIterator.to(Iterator.scala:1157)
              scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
              scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
              scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
              scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
              org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:706)
              org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:706)
              org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1071)
              org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1071)
              org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
              org.apache.spark.scheduler.Task.run(Task.scala:51)
              org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:208)
              java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              java.lang.Thread.run(Thread.java:744)
      Driver stacktrace:
      	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
      	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
      	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
      	at scala.Option.foreach(Option.scala:236)
      	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
      	at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
      	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
      	at akka.actor.ActorCell.invoke(ActorCell.scala:456)
      	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
      	at akka.dispatch.Mailbox.run(Mailbox.scala:219)
      	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
      	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
      	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
      	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
      	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      

      We believe the error is associated with Kryo serialization of the EdgePartition. The first collect() succeeds because it computes the partitions directly, but the second collect() reads them back from the MEMORY_ONLY_SER cache; the deserialized EdgePartition then hits a null internal field in its iterator's next() (EdgePartition.scala:269).
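      As a rough illustration of why a field lost in (de)serialization would produce exactly this trace, here is a minimal, self-contained sketch. FakePartition is a hypothetical class, not the real EdgePartition, and it simulates the dropped field with @transient plus Java serialization, since reproducing the Kryo internals here is not practical:

```scala
import java.io._

// Hypothetical sketch of the suspected failure mode -- not the actual
// EdgePartition code. A field that the serializer fails to restore comes
// back null after a round trip, so the partition's iterator throws a
// NullPointerException the second time the cached block is read.
class FakePartition(@transient val data: Array[Int]) extends Serializable {
  def iterator: Iterator[Int] = new Iterator[Int] {
    private var i = 0
    def hasNext: Boolean = i < data.length // NPE here once `data` is null
    def next(): Int = { val v = data(i); i += 1; v }
  }
}

object Demo {
  // Round-trip through Java serialization, standing in for what a
  // MEMORY_ONLY_SER cache read does to the partition object.
  def roundTrip(p: FakePartition): FakePartition = {
    val bos = new ByteArrayOutputStream()
    new ObjectOutputStream(bos).writeObject(p)
    new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray))
      .readObject().asInstanceOf[FakePartition]
  }

  def main(args: Array[String]): Unit = {
    val p = new FakePartition(Array(1, 2, 3))
    println(p.iterator.toList) // first pass over the live object works
    val p2 = roundTrip(p)      // `data` is dropped and comes back null
    p2.iterator.next()         // second pass: NullPointerException
  }
}
```

      The parallel to the report: the first collect() iterates the freshly computed partitions, the second iterates deserialized copies, and only the second fails.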

          People

            Assignee: jegonzal (Joseph E. Gonzalez)
            Reporter: jegonzal (Joseph E. Gonzalez)
