[SPARK-2243] Support multiple SparkContexts in the same JVM


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.7.0, 1.0.0, 1.1.0
    • Fix Version/s: None
    • Component/s: Block Manager, Spark Core
    • Labels: None

    Description

      We're developing a platform where we create several Spark contexts to carry out different calculations. Is there any restriction on using several Spark contexts in the same JVM? We have two contexts: one for Spark calculations and another for Spark Streaming jobs.
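
      Roughly, the shape of our code is the following (a simplified, illustrative sketch; the object name, the socket source, and the port are placeholders, not our actual platform code):

      import org.apache.spark.{SparkConf, SparkContext}
      import org.apache.spark.streaming.{Seconds, StreamingContext}

      object TwoContextsSketch {
        def main(args: Array[String]): Unit = {
          // First context: a plain Spark calculation.
          val sc = new SparkContext(
            new SparkConf().setMaster("local[2]").setAppName("batch-calc"))
          val total = sc.parallelize(1 to 100).reduce(_ + _)
          println("batch result: " + total)
          sc.stop() // the batch calculation is finished at this point

          // Second context, created afterwards in the same JVM:
          // a Spark Streaming job reading from a socket.
          val ssc = new StreamingContext(
            new SparkConf().setMaster("local[2]").setAppName("streaming-job"),
            Seconds(1))
          ssc.socketTextStream("localhost", 9999).count().print()
          ssc.start()
          ssc.awaitTermination()
        }
      }

      The following error arises when we first execute the Spark calculation and then, once that execution has finished, launch the Spark Streaming job: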

      14/06/23 16:40:08 ERROR executor.Executor: Exception in task ID 0
      java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0
      	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1624)
      	at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:156)
      	at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
      	at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63)
      	at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139)
      	at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
      	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62)
      	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:193)
      	at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      14/06/23 16:40:08 WARN scheduler.TaskSetManager: Lost TID 0 (task 0.0:0)
      14/06/23 16:40:08 WARN scheduler.TaskSetManager: Loss was due to java.io.FileNotFoundException
      java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0
      	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1624)
      	at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:156)
      	at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
      	at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63)
      	at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139)
      	at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
      	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62)
      	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:193)
      	at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      14/06/23 16:40:08 ERROR scheduler.TaskSetManager: Task 0.0:0 failed 1 times; aborting job
      14/06/23 16:40:08 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
      14/06/23 16:40:08 INFO scheduler.DAGScheduler: Failed to run runJob at NetworkInputTracker.scala:182
      [WARNING] 
      org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed 1 times (most recent failure: Exception failure: java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)
      	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
      	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
      	at scala.Option.foreach(Option.scala:236)
      	at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
      	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
      	at akka.actor.ActorCell.invoke(ActorCell.scala:456)
      	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
      	at akka.dispatch.Mailbox.run(Mailbox.scala:219)
      	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:385)
      	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
      	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
      	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
      	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      14/06/23 16:40:09 INFO dstream.ForEachDStream: metadataCleanupDelay = 3600
      

      So far we are only working on localhost. Any clue about where this error is coming from? Is there any workaround for the issue?


People

    • Assignee: Unassigned
    • Reporter: Miguel Angel Fernandez Diaz (mafernandez)
    • Votes: 38
    • Watchers: 82

