Spark / SPARK-14228

Lost executor of RPC disassociated, and occurs exception: Could not find CoarseGrainedScheduler or it has been stopped

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0
    • Component/s: None
    • Labels:
      None

      Description

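      When I start 1000 executors and then stop the application, SparkContext.stop is called to stop all executors. During this shutdown, executors that have already been killed lose their RPC connection to the driver. As the stack trace below shows, the driver handles each disconnect by removing the executor and calling reviveOffers, but the message cannot be delivered because the CoarseGrainedScheduler endpoint cannot be found or has already been stopped.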

      16/03/29 01:45:45 ERROR YarnScheduler: Lost executor 610 on 51-196-152-8: remote Rpc client disassociated
      16/03/29 01:45:45 ERROR Inbox: Ignoring error
      org.apache.spark.SparkException: Could not find CoarseGrainedScheduler or it has been stopped.
      at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:161)
      at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:131)
      at org.apache.spark.rpc.netty.NettyRpcEnv.send(NettyRpcEnv.scala:173)
      at org.apache.spark.rpc.netty.NettyRpcEndpointRef.send(NettyRpcEnv.scala:398)
      at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.reviveOffers(CoarseGrainedSchedulerBackend.scala:314)
      at org.apache.spark.scheduler.TaskSchedulerImpl.executorLost(TaskSchedulerImpl.scala:482)
      at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.removeExecutor(CoarseGrainedSchedulerBackend.scala:261)
      at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$onDisconnected$1.apply(CoarseGrainedSchedulerBackend.scala:207)
      at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$onDisconnected$1.apply(CoarseGrainedSchedulerBackend.scala:207)
      at scala.Option.foreach(Option.scala:236)
      at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.onDisconnected(CoarseGrainedSchedulerBackend.scala:207)
      at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:144)
      at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
      at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:102)
      at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
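
The sequence in the trace (onDisconnected, then removeExecutor, then executorLost, then reviveOffers, then a failed send) can be modelled with a small, language-neutral sketch. This is not Spark code: the `Dispatcher`, `Driver`, `EndpointStopped`, and `guard_on_stop` names are hypothetical, and the guard shown is only one possible mitigation under these assumptions, not necessarily the change that shipped in 2.3.0.

```python
class EndpointStopped(Exception):
    """Hypothetical stand-in for the SparkException seen in the log."""


class Dispatcher:
    """Toy message router: delivers only to endpoints that are still registered."""

    def __init__(self):
        self._endpoints = {}

    def register(self, name, handler):
        self._endpoints[name] = handler

    def unregister(self, name):
        self._endpoints.pop(name, None)

    def post(self, name, message):
        handler = self._endpoints.get(name)
        if handler is None:
            raise EndpointStopped(f"Could not find {name} or it has been stopped.")
        handler(message)


class Driver:
    """Toy driver that sends itself ReviveOffers whenever an executor is lost."""

    ENDPOINT = "CoarseGrainedScheduler"

    def __init__(self, dispatcher, guard_on_stop):
        self.dispatcher = dispatcher
        self.guard_on_stop = guard_on_stop  # assumption: skip revive while stopping
        self.stopping = False
        self.revived = 0
        self.errors = []
        dispatcher.register(self.ENDPOINT, self._on_message)

    def _on_message(self, message):
        if message == "ReviveOffers":
            self.revived += 1

    def stop(self):
        # Shutdown unregisters the endpoint; disconnect events may still arrive.
        self.stopping = True
        self.dispatcher.unregister(self.ENDPOINT)

    def on_executor_disconnected(self, executor_id):
        if self.guard_on_stop and self.stopping:
            return  # nothing left to schedule, so do not message a stopped endpoint
        try:
            self.dispatcher.post(self.ENDPOINT, "ReviveOffers")
        except EndpointStopped as exc:
            # Mirrors the "Ignoring error" log line: recorded, not re-raised.
            self.errors.append((executor_id, str(exc)))


def simulate(guard_on_stop, executors=1000):
    driver = Driver(Dispatcher(), guard_on_stop)
    driver.on_executor_disconnected(0)  # before shutdown: delivered normally
    driver.stop()
    for executor_id in range(1, executors):
        driver.on_executor_disconnected(executor_id)  # late disconnects
    return driver


unguarded = simulate(guard_on_stop=False)
guarded = simulate(guard_on_stop=True)
print(len(unguarded.errors), len(guarded.errors))
```

In this model the unguarded run records one error per late disconnect, which matches the flood of log entries reported with 1000 executors, while the guarded run records none.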

            People

            • Assignee: Unassigned
            • Reporter: meiyoula
            • Votes: 7
            • Watchers: 17
