Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-2243

Support multiple SparkContexts in the same JVM

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.7.0, 1.0.0, 1.1.0
    • Fix Version/s: None
    • Component/s: Block Manager, Spark Core
    • Labels:
      None

      Description

      We're developing a platform where we create several Spark contexts for carrying out different calculations. Is there any restriction when using several Spark contexts? We have two contexts, one for Spark calculations and another one for Spark Streaming jobs. The next error arises when we first execute a Spark calculation and, once the execution is finished, a Spark Streaming job is launched:

      14/06/23 16:40:08 ERROR executor.Executor: Exception in task ID 0
      java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0
      	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1624)
      	at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:156)
      	at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
      	at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63)
      	at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139)
      	at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
      	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62)
      	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:193)
      	at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      14/06/23 16:40:08 WARN scheduler.TaskSetManager: Lost TID 0 (task 0.0:0)
      14/06/23 16:40:08 WARN scheduler.TaskSetManager: Loss was due to java.io.FileNotFoundException
      java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0
      	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1624)
      	at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:156)
      	at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
      	at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63)
      	at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139)
      	at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
      	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62)
      	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:193)
      	at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      14/06/23 16:40:08 ERROR scheduler.TaskSetManager: Task 0.0:0 failed 1 times; aborting job
      14/06/23 16:40:08 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
      14/06/23 16:40:08 INFO scheduler.DAGScheduler: Failed to run runJob at NetworkInputTracker.scala:182
      [WARNING] 
      org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed 1 times (most recent failure: Exception failure: java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)
      	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
      	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
      	at scala.Option.foreach(Option.scala:236)
      	at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
      	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
      	at akka.actor.ActorCell.invoke(ActorCell.scala:456)
      	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
      	at akka.dispatch.Mailbox.run(Mailbox.scala:219)
      	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:385)
      	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
      	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
      	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
      	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      14/06/23 16:40:09 INFO dstream.ForEachDStream: metadataCleanupDelay = 3600
      

      So far, we are working on localhost. Any clue about where this error is coming from? Any workaround to solve the issue?

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                mafernandez Miguel Angel Fernandez Diaz
              • Votes:
                38 Vote for this issue
                Watchers:
                81 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: