Spark / SPARK-2243

Support multiple SparkContexts in the same JVM


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.7.0, 1.0.0, 1.1.0
    • Fix Version/s: None
    • Component/s: Block Manager, Spark Core
    • Labels: None

    Description

      We're developing a platform where we create several Spark contexts for carrying out different calculations. Is there any restriction on using several Spark contexts in the same JVM? We have two contexts: one for Spark calculations and another for Spark Streaming jobs.
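      For reference, here is a minimal sketch of that two-context pattern (variable names, app names, and the master URL are illustrative, not taken from the actual code):

          import org.apache.spark.{SparkConf, SparkContext}
          import org.apache.spark.streaming.{Seconds, StreamingContext}

          // First context, used for batch Spark calculations
          val batchConf = new SparkConf().setMaster("local[2]").setAppName("calculations")
          val batchSc = new SparkContext(batchConf)

          // Second, independent context created in the same JVM for streaming jobs
          val streamingConf = new SparkConf().setMaster("local[2]").setAppName("streaming")
          val streamingSc = new SparkContext(streamingConf)
          val ssc = new StreamingContext(streamingSc, Seconds(1))

      The following error arises when we first execute a Spark calculation and then, once that execution has finished, launch a Spark Streaming job: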

      14/06/23 16:40:08 ERROR executor.Executor: Exception in task ID 0
      java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0
      	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1624)
      	at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:156)
      	at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
      	at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63)
      	at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139)
      	at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
      	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62)
      	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:193)
      	at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      14/06/23 16:40:08 WARN scheduler.TaskSetManager: Lost TID 0 (task 0.0:0)
      14/06/23 16:40:08 WARN scheduler.TaskSetManager: Loss was due to java.io.FileNotFoundException
      java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0
      	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1624)
      	at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:156)
      	at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
      	at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63)
      	at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139)
      	at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
      	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62)
      	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:193)
      	at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      14/06/23 16:40:08 ERROR scheduler.TaskSetManager: Task 0.0:0 failed 1 times; aborting job
      14/06/23 16:40:08 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
      14/06/23 16:40:08 INFO scheduler.DAGScheduler: Failed to run runJob at NetworkInputTracker.scala:182
      [WARNING] 
      org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed 1 times (most recent failure: Exception failure: java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)
      	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
      	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
      	at scala.Option.foreach(Option.scala:236)
      	at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
      	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
      	at akka.actor.ActorCell.invoke(ActorCell.scala:456)
      	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
      	at akka.dispatch.Mailbox.run(Mailbox.scala:219)
      	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:385)
      	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
      	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
      	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
      	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      14/06/23 16:40:09 INFO dstream.ForEachDStream: metadataCleanupDelay = 3600
      

      So far, we are working on localhost only. Any clue as to where this error comes from? Is there any workaround for this issue?
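      A possible workaround, given that Spark does not support multiple concurrent SparkContexts in one JVM: share a single SparkContext between the batch and streaming work by passing it to the StreamingContext constructor, instead of creating a second context. A minimal sketch under that assumption (names and the socket source are illustrative):

          import org.apache.spark.{SparkConf, SparkContext}
          import org.apache.spark.streaming.{Seconds, StreamingContext}

          val conf = new SparkConf().setMaster("local[2]").setAppName("shared-context")
          val sc = new SparkContext(conf)

          // Batch calculation runs on the shared context
          val total = sc.parallelize(1 to 1000).sum()

          // The streaming job is built on the SAME SparkContext rather than a new one
          val ssc = new StreamingContext(sc, Seconds(1))
          val lines = ssc.socketTextStream("localhost", 9999)  // hypothetical input source
          lines.count().print()
          ssc.start()
          ssc.awaitTermination()

      If the two workloads must stay separate, run them strictly one after another by calling sc.stop() on the first context before constructing the second. Later 1.x releases also add a spark.driver.allowMultipleContexts configuration flag that disables the single-context check, but multiple simultaneous contexts remain unsupported, consistent with this issue's Won't Fix resolution.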

People

    Assignee: Unassigned
    Reporter: Miguel Angel Fernandez Diaz (mafernandez)
    Votes: 38
    Watchers: 72
