SPARK-16513

Spark executor deadlocks itself in memory management

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.6.1
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      I have a Spark Streaming application that uses two stateful RDDs, but any given job uses only one of them. The last part of the executor stderr log is enclosed; stdout is empty. Three concurrent Spark tasks on the executor are deadlocked, with the following stack traces:

      org.apache.spark.storage.BlockManager.dropFromMemory(BlockManager.scala:1029)
      org.apache.spark.storage.BlockManager.dropFromMemory(BlockManager.scala:1009)
      org.apache.spark.storage.MemoryStore$$anonfun$evictBlocksToFreeSpace$2.apply(MemoryStore.scala:460)
      org.apache.spark.storage.MemoryStore$$anonfun$evictBlocksToFreeSpace$2.apply(MemoryStore.scala:449)
      scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
      org.apache.spark.storage.MemoryStore.evictBlocksToFreeSpace(MemoryStore.scala:449)
      org.apache.spark.memory.StorageMemoryPool.acquireMemory(StorageMemoryPool.scala:89)
      org.apache.spark.memory.StorageMemoryPool.acquireMemory(StorageMemoryPool.scala:69)
      org.apache.spark.memory.UnifiedMemoryManager.acquireStorageMemory(UnifiedMemoryManager.scala:155)
      org.apache.spark.memory.UnifiedMemoryManager.acquireUnrollMemory(UnifiedMemoryManager.scala:162)
      org.apache.spark.storage.MemoryStore.reserveUnrollMemoryForThisTask(MemoryStore.scala:493)
      org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:291)
      org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
      org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
      org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
      org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
      org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
      org.apache.spark.scheduler.Task.run(Task.scala:89)
      org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
      java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      java.lang.Thread.run(Thread.java:745)

      org.apache.spark.storage.MemoryStore.tryToPut(MemoryStore.scala:379)
      org.apache.spark.storage.MemoryStore.tryToPut(MemoryStore.scala:346)
      org.apache.spark.storage.MemoryStore.putArray(MemoryStore.scala:133)
      org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:800)
      org.apache.spark.storage.BlockManager.putArray(BlockManager.scala:676)
      org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:175)
      org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
      org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
      org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
      org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
      org.apache.spark.scheduler.Task.run(Task.scala:89)
      org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
      java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      java.lang.Thread.run(Thread.java:745)

      org.apache.spark.memory.MemoryManager.releaseExecutionMemory(MemoryManager.scala:120)
      org.apache.spark.memory.TaskMemoryManager.releaseExecutionMemory(TaskMemoryManager.java:201)
      org.apache.spark.util.collection.Spillable$class.releaseMemory(Spillable.scala:111)
      org.apache.spark.util.collection.ExternalSorter.releaseMemory(ExternalSorter.scala:89)
      org.apache.spark.util.collection.ExternalSorter.stop(ExternalSorter.scala:694)
      org.apache.spark.shuffle.sort.SortShuffleWriter.stop(SortShuffleWriter.scala:95)
      org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:74)
      org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
      org.apache.spark.scheduler.Task.run(Task.scala:89)
      org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
      java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      java.lang.Thread.run(Thread.java:745)
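
      The traces above show where each task is stuck, but not which monitors each thread holds or is waiting for. As a generic illustration only (this is not Spark's code, and the class, thread, and lock names are hypothetical), the sketch below provokes a lock-order inversion between two threads and then asks the JVM to confirm it with the standard ThreadMXBean.findMonitorDeadlockedThreads() call. The same kind of check against a hung executor would show whether the JVM itself sees a monitor cycle.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockSketch {

    /** Starts a daemon thread that takes `first`, waits for its peer, then blocks on `second`. */
    private static void lockInOrder(String name, Object first, Object second,
                                    CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    // Wait until the peer thread also holds its first lock.
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // Never reached: the peer holds `second` and wants `first`.
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        t.start();
    }

    /** Provokes a two-thread deadlock and returns the thread ids the JVM reports. */
    public static long[] provokeAndDetect() throws InterruptedException {
        // Hypothetical stand-ins for two monitors; not Spark's actual locks.
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        lockInOrder("sketch-task-1", lockA, lockB, bothHoldFirst);
        lockInOrder("sketch-task-2", lockB, lockA, bothHoldFirst); // opposite order

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.nanoTime() + 5_000_000_000L; // poll for up to 5 seconds
        while (System.nanoTime() < deadline) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no cycle exists
            if (ids != null) {
                return ids;
            }
            Thread.sleep(50);
        }
        return new long[0];
    }

    public static void main(String[] args) throws InterruptedException {
        long[] ids = provokeAndDetect();
        System.out.println("Deadlocked thread count: " + ids.length);
    }
}
```

      A thread dump taken with jstack reports the same condition under a "Found one Java-level deadlock" heading when the JVM detects a monitor cycle.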

      This is the log file excerpt:

      Attachments

        1. sparklog
          100 kB
          Steven Lowenthal
        2. screenshot-1.png
          28 kB
          Steven Lowenthal
        3. hung.executor.stack.txt
          136 kB
          Steven Lowenthal
        4. driver.stack.txt
          83 kB
          Steven Lowenthal

            People

              Assignee: Unassigned
              Reporter: Steven Lowenthal (slowenthal)
              Votes: 0
              Watchers: 2
