SPARK-537: driver.run() returned with code DRIVER_ABORTED

Details

    • Type: Bug
    • Status: Resolved
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: None
    • Labels: None

    Description

      Hi there,
      When I try to run Spark on a Mesos cluster, the job fails with the following errors:

      ```
      ./run spark.examples.SparkPi ...:5050
      12/09/07 14:49:28 INFO spark.BoundedMemoryCache: BoundedMemoryCache.maxBytes = 994836480
      12/09/07 14:49:28 INFO spark.CacheTrackerActor: Registered actor on port 7077
      12/09/07 14:49:28 INFO spark.CacheTrackerActor: Started slave cache (size 948.8MB) on shawpc
      12/09/07 14:49:28 INFO spark.MapOutputTrackerActor: Registered actor on port 7077
      12/09/07 14:49:28 INFO spark.ShuffleManager: Shuffle dir: /tmp/spark-local-81220c47-bc43-4809-ac48-5e3e8e023c8a/shuffle
      12/09/07 14:49:28 INFO server.Server: jetty-7.5.3.v20111011
      12/09/07 14:49:28 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:57595 STARTING
      12/09/07 14:49:28 INFO spark.ShuffleManager: Local URI: http://127.0.1.1:57595
      12/09/07 14:49:28 INFO server.Server: jetty-7.5.3.v20111011
      12/09/07 14:49:28 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:60113 STARTING
      12/09/07 14:49:28 INFO broadcast.HttpBroadcast: Broadcast server started at http://127.0.1.1:60113
      12/09/07 14:49:28 INFO spark.MesosScheduler: Temp directory for JARs: /tmp/spark-d541f37c-ae35-476c-b2fc-9908b0739f50
      12/09/07 14:49:28 INFO server.Server: jetty-7.5.3.v20111011
      12/09/07 14:49:28 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:50511 STARTING
      12/09/07 14:49:28 INFO spark.MesosScheduler: JAR server started at http://127.0.1.1:50511
      12/09/07 14:49:28 INFO spark.MesosScheduler: Registered as framework ID 201209071448-846324308-5050-26925-0000
      12/09/07 14:49:29 INFO spark.SparkContext: Starting job...
      12/09/07 14:49:29 INFO spark.CacheTracker: Registering RDD ID 1 with cache
      12/09/07 14:49:29 INFO spark.CacheTrackerActor: Registering RDD 1 with 2 partitions
      12/09/07 14:49:29 INFO spark.CacheTracker: Registering RDD ID 0 with cache
      12/09/07 14:49:29 INFO spark.CacheTrackerActor: Registering RDD 0 with 2 partitions
      12/09/07 14:49:29 INFO spark.CacheTrackerActor: Asked for current cache locations
      12/09/07 14:49:29 INFO spark.MesosScheduler: Final stage: Stage 0
      12/09/07 14:49:29 INFO spark.MesosScheduler: Parents of final stage: List()
      12/09/07 14:49:29 INFO spark.MesosScheduler: Missing parents: List()
      12/09/07 14:49:29 INFO spark.MesosScheduler: Submitting Stage 0, which has no missing parents
      12/09/07 14:49:29 INFO spark.MesosScheduler: Got a job with 2 tasks
      12/09/07 14:49:29 INFO spark.MesosScheduler: Adding job with ID 0
      12/09/07 14:49:29 INFO spark.SimpleJob: Starting task 0:0 as TID 0 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
      12/09/07 14:49:29 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 52 ms to serialize by spark.JavaSerializerInstance
      12/09/07 14:49:29 INFO spark.SimpleJob: Starting task 0:1 as TID 1 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
      12/09/07 14:49:29 INFO spark.SimpleJob: Size of task 0:1 is 1606 bytes and took 1 ms to serialize by spark.JavaSerializerInstance
      12/09/07 14:49:30 INFO spark.SimpleJob: Lost TID 0 (task 0:0)
      12/09/07 14:49:30 INFO spark.SimpleJob: Starting task 0:0 as TID 2 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
      12/09/07 14:49:30 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 0 ms to serialize by spark.JavaSerializerInstance
      12/09/07 14:49:30 INFO spark.SimpleJob: Lost TID 1 (task 0:1)
      12/09/07 14:49:30 INFO spark.SimpleJob: Lost TID 2 (task 0:0)
      12/09/07 14:49:30 INFO spark.SimpleJob: Starting task 0:0 as TID 3 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
      12/09/07 14:49:30 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 2 ms to serialize by spark.JavaSerializerInstance
      12/09/07 14:49:32 INFO spark.SimpleJob: Starting task 0:1 as TID 4 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
      12/09/07 14:49:32 INFO spark.SimpleJob: Size of task 0:1 is 1606 bytes and took 1 ms to serialize by spark.JavaSerializerInstance
      12/09/07 14:49:32 INFO spark.SimpleJob: Lost TID 3 (task 0:0)
      12/09/07 14:49:32 INFO spark.SimpleJob: Starting task 0:0 as TID 5 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
      12/09/07 14:49:32 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 0 ms to serialize by spark.JavaSerializerInstance
      12/09/07 14:49:32 INFO spark.SimpleJob: Lost TID 4 (task 0:1)
      12/09/07 14:49:32 INFO spark.SimpleJob: Lost TID 5 (task 0:0)
      12/09/07 14:49:32 INFO spark.SimpleJob: Starting task 0:0 as TID 6 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
      12/09/07 14:49:32 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 0 ms to serialize by spark.JavaSerializerInstance
      12/09/07 14:49:34 INFO spark.SimpleJob: Starting task 0:1 as TID 7 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
      12/09/07 14:49:34 INFO spark.SimpleJob: Size of task 0:1 is 1606 bytes and took 2 ms to serialize by spark.JavaSerializerInstance
      12/09/07 14:49:34 INFO spark.SimpleJob: Lost TID 6 (task 0:0)
      12/09/07 14:49:34 ERROR spark.SimpleJob: Task 0:0 failed more than 4 times; aborting job
      Exception in thread "Thread-50" java.io.EOFException
      at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2280)
      at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2749)
      at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:779)
      at java.io.ObjectInputStream.<init>(ObjectInputStream.java:279)
      at spark.JavaSerializerInstance$$anon$2.<init>(JavaSerializer.scala:39)
      at spark.JavaSerializerInstance.deserialize(JavaSerializer.scala:39)
      at spark.SimpleJob.taskLost(SimpleJob.scala:296)
      at spark.SimpleJob.statusUpdate(SimpleJob.scala:207)
      at spark.MesosScheduler.statusUpdate(MesosScheduler.scala:287)
      12/09/07 14:49:34 INFO spark.SimpleJob: Starting task 0:0 as TID 8 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
      12/09/07 14:49:34 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 1 ms to serialize by spark.JavaSerializerInstance
      12/09/07 14:49:34 INFO spark.SimpleJob: Lost TID 7 (task 0:1)
      12/09/07 14:49:34 INFO spark.MesosScheduler: driver.run() returned with code DRIVER_ABORTED
      ```
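
The stack trace above fails inside the `java.io.ObjectInputStream` constructor (`readStreamHeader`), reached from `SimpleJob.taskLost`. Plain Java serialization raises `EOFException` at that point when the byte stream ends before the 4-byte stream header, for example when the payload is empty. Whether the lost-task status update here actually carried an empty payload is an assumption; the log does not show it. The following is a minimal Java sketch of that behavior only, not Spark's code; `EmptyPayloadDemo` and `deserializeOrNull` are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class EmptyPayloadDemo {
    // Plain Java deserialization: the ObjectInputStream constructor reads
    // the stream header before any object is read, so an empty array fails
    // in the constructor itself.
    public static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Hypothetical guard (an assumption, not the actual fix): treat a missing
    // or empty payload as "no detail available" instead of deserializing it.
    public static Object deserializeOrNull(byte[] data) throws IOException, ClassNotFoundException {
        if (data == null || data.length == 0) {
            return null;
        }
        return deserialize(data);
    }

    // Helper used only to build a valid payload for the round-trip check.
    public static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        try {
            deserialize(new byte[0]);
        } catch (EOFException e) {
            // Same exception type and origin as in the report's stack trace.
            System.out.println("empty payload: " + e.getClass().getName());
        }
        System.out.println("guarded: " + deserializeOrNull(new byte[0]));
        System.out.println("round trip: " + deserializeOrNull(serialize("task failed")));
    }
}
```

If the payload was indeed empty, the executor-side logs for the lost tasks on the slave would be the place to look for the underlying failure, since the driver log only records `Lost TID` without a reason.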

          People

            Assignee: Unassigned
            Reporter: yshaw