  1. Spark
  2. SPARK-11487

Spark Master shuts down automatically after executing some applications

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 1.5.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Environment: Spark Standalone on CentOS 6.6;
      cluster of one master and 5 worker nodes (each node: > 150 GB memory, 72 cores)

    Description

      The master logs after Spark's automatic shutdown are as follows:

      15/11/02 20:50:01 INFO master.Master: Registering app PythonWordCount
      15/11/02 20:50:01 INFO master.Master: Registered app PythonWordCount with ID app-20151102205001-0025
      15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/0 on worker worker-20151030135450-x.x.x.76-42502
      15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/1 on worker worker-20151030135450-x.x.x.86-51916
      15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/2 on worker worker-20151030135450-x.x.x.85-47388
      15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/3 on worker worker-20151030125450-x.x.x.69-51604
      15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/4 on worker worker-20151030135450-x.x.x.87-35705
      15/11/02 20:57:35 INFO master.Master: Received unregister request from application app-20151102205001-0025
      15/11/02 20:57:35 INFO master.Master: Removing app app-20151102205001-0025
      15/11/02 20:57:35 WARN master.Master: Application PythonWordCount is still in progress, it may be terminated abnormally.
      15/11/02 20:57:35 INFO spark.SecurityManager: Changing view acls to: root
      15/11/02 20:57:35 INFO spark.SecurityManager: Changing modify acls to: root
      15/11/02 20:57:35 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
      15/11/02 20:57:43 INFO master.Master: x.x.x.x:47502 got disassociated, removing it.
      15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/4
      15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/3
      15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/0
      15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/2
      15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/1
      15/11/02 20:58:28 INFO master.Master: Registering app App Test
      15/11/02 20:58:28 INFO master.Master: Registered app App Test with ID app-20151102205828-0026
      15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/0 on worker worker-20151030135450-x.x.x.76-42502
      15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/1 on worker worker-20151030135450-x.x.x.86-51916
      15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/2 on worker worker-20151030135450-x.x.x.85-47388
      15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/3 on worker worker-20151030125450-x.x.x.69-51604
      15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/4 on worker worker-20151030135450-x.x.x.87-35705
      15/11/02 20:59:35 INFO master.Master: Received unregister request from application app-20151102205828-0026
      15/11/02 20:59:35 INFO master.Master: Removing app app-20151102205828-0026
      15/11/02 20:59:35 WARN master.Master: Application App Test is still in progress, it may be terminated abnormally.
      15/11/02 20:59:35 INFO spark.SecurityManager: Changing view acls to: root
      15/11/02 20:59:35 INFO spark.SecurityManager: Changing modify acls to: root
      15/11/02 20:59:35 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
      15/11/02 21:17:46 INFO master.Master: x.x.x.x:40954 got disassociated, removing it.
      15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/3
      15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/1
      15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/0
      15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/2
      15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/4
      15/11/02 21:17:46 INFO master.Master: x.x.x.x:37676 got disassociated, removing it.
      15/11/02 21:17:48 ERROR akka.ErrorMonitor: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-3] shutting down ActorSystem [sparkMaster]
      java.lang.OutOfMemoryError: Java heap space
      at com.fasterxml.jackson.core.util.BufferRecycler.calloc(BufferRecycler.java:156)
      at com.fasterxml.jackson.core.util.BufferRecycler.allocCharBuffer(BufferRecycler.java:124)
      at com.fasterxml.jackson.core.io.IOContext.allocTokenBuffer(IOContext.java:181)
      at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:830)
      at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2161)
      at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:19)
      at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:44)
      at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:58)
      at org.apache.spark.deploy.master.Master.rebuildSparkUI(Master.scala:950)
      at org.apache.spark.deploy.master.Master.removeApplication(Master.scala:812)
      at org.apache.spark.deploy.master.Master.org$apache$spark$deploy$master$Master$$finishApplication(Master.scala:790)
      at org.apache.spark.deploy.master.Master$$anonfun$receive$1$$anonfun$applyOrElse$21.apply(Master.scala:382)
      at org.apache.spark.deploy.master.Master$$anonfun$receive$1$$anonfun$applyOrElse$21.apply(Master.scala:382)
      at scala.Option.foreach(Option.scala:236)
      at org.apache.spark.deploy.master.Master$$anonfun$receive$1.applyOrElse(Master.scala:382)
      at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$processMessage(AkkaRpcEnv.scala:177)
      at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$4.apply$mcV$sp(AkkaRpcEnv.scala:126)
      at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$safelyCall(AkkaRpcEnv.scala:197)
      at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1.applyOrElse(AkkaRpcEnv.scala:125)
      at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
      at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
      at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
      at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:59)
      at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
      at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
      at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
      at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
      at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1.aroundReceive(AkkaRpcEnv.scala:92)
      at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
      at akka.actor.ActorCell.invoke(ActorCell.scala:487)
      at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
      at akka.dispatch.Mailbox.run(Mailbox.scala:220)
      15/11/02 21:17:48 ERROR actor.ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-3] shutting down ActorSystem [sparkMaster]
      java.lang.OutOfMemoryError: Java heap space
      at com.fasterxml.jackson.core.util.BufferRecycler.calloc(BufferRecycler.java:156)
      at com.fasterxml.jackson.core.util.BufferRecycler.allocCharBuffer(BufferRecycler.java:124)
      at com.fasterxml.jackson.core.io.IOContext.allocTokenBuffer(IOContext.java:181)
      at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:830)
      at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2161)
      at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:19)
      at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:44)
      at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:58)
      at org.apache.spark.deploy.master.Master.rebuildSparkUI(Master.scala:950)
      at org.apache.spark.deploy.master.Master.removeApplication(Master.scala:812)
      at org.apache.spark.deploy.master.Master.org$apache$spark$deploy$master$Master$$finishApplication(Master.scala:790)
      at org.apache.spark.deploy.master.Master$$anonfun$receive$1$$anonfun$applyOrElse$21.apply(Master.scala:382)
      at org.apache.spark.deploy.master.Master$$anonfun$receive$1$$anonfun$applyOrElse$21.apply(Master.scala:382)
      at scala.Option.foreach(Option.scala:236)
      at org.apache.spark.deploy.master.Master$$anonfun$receive$1.applyOrElse(Master.scala:382)
      at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$processMessage(AkkaRpcEnv.scala:177)
      at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$4.apply$mcV$sp(AkkaRpcEnv.scala:126)
      at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$safelyCall(AkkaRpcEnv.scala:197)
      at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1.applyOrElse(AkkaRpcEnv.scala:125)
      at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
      at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
      at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
      at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:59)
      at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
      at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
      at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
      at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
      at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1.aroundReceive(AkkaRpcEnv.scala:92)
      at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
      at akka.actor.ActorCell.invoke(ActorCell.scala:487)
      at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
      at akka.dispatch.Mailbox.run(Mailbox.scala:220)
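To read a trace like the one above, it can help to pull out the fatal error and only the Spark frames beneath it. The following is a minimal sketch, not part of the original report: the function name `first_fatal_error` and the abbreviated `sample` excerpt are illustrative assumptions, and the regular expressions assume the log layout shown in this issue. For this log, the Spark frames it extracts show the heap running out while the master replays an event log (`ReplayListenerBus.replay`) to rebuild the UI of an application being removed (`Master.rebuildSparkUI`).

```python
import re

# Header line that Akka logs before a fatal stack trace (layout assumed from this issue).
FATAL_RE = re.compile(r"ERROR \S+: Uncaught fatal error from thread \[(?P<thread>[^\]]+)\]")
# One JVM stack frame, e.g. "at pkg.Class.method(File.scala:58)".
FRAME_RE = re.compile(r"^\s*at (?P<method>[\w.$]+)\((?P<source>[^)]*)\)")


def first_fatal_error(lines):
    """Return (thread, exception, spark_frames) for the first fatal error, or None."""
    lines = iter(lines)
    for line in lines:
        match = FATAL_RE.search(line)
        if not match:
            continue
        # The exception line directly follows the header line.
        exception = next(lines, "").strip()
        spark_frames = []
        for frame_line in lines:
            frame = FRAME_RE.match(frame_line)
            if frame is None:
                break  # first non-frame line ends this stack trace
            if frame.group("method").startswith("org.apache.spark."):
                spark_frames.append(frame.group("method"))
        return match.group("thread"), exception, spark_frames
    return None


# Abbreviated excerpt of the master log from this issue (most frames omitted).
sample = """\
15/11/02 21:17:48 ERROR akka.ErrorMonitor: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-3] shutting down ActorSystem [sparkMaster]
java.lang.OutOfMemoryError: Java heap space
at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:830)
at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:44)
at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:58)
at org.apache.spark.deploy.master.Master.rebuildSparkUI(Master.scala:950)
at org.apache.spark.deploy.master.Master.removeApplication(Master.scala:812)
15/11/02 21:17:48 ERROR actor.ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-3] shutting down ActorSystem [sparkMaster]
""".splitlines()

thread, exception, spark_frames = first_fatal_error(sample)
print(thread)
print(exception)
for method in spark_frames:
    print("  ", method)
```

In practice the same function could be fed the lines of the real master log file instead of `sample`; it stops at the first fatal error, so the repeated trace logged by `actor.ActorSystemImpl` is not reported twice.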

          People

            Assignee: Unassigned
            Reporter: Sandeep Pal (vnayak053)