SPARK-20882

Executor is waiting for ShuffleBlockFetcherIterator

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 2.1.0, 2.1.1
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      This bug looks like https://issues.apache.org/jira/browse/SPARK-19300,
      but I have already updated my client Netty version to 4.0.43.Final.
      The shuffle service handler is still on Netty 4.0.42.Final.
      spark.sql.adaptive.enabled is true.

      "Executor task launch worker for task 4808985" #5373 daemon prio=5 os_prio=0 tid=0x00007f54ef437000 nid=0x1aed0 waiting on condition [0x00007f53aebfe000]
      java.lang.Thread.State: WAITING (parking)
      at sun.misc.Unsafe.park(Native Method)
      parking to wait for <0x0000000498c249c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
      at java.util.concurrent.locks.LockSupport.park(LockSupport.java:189)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
      at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
      at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:332)
      at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:58)
      at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
      at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
      at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
      at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
      at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
      at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
      at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
      at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
      at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
      at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
      at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:199)
      at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
      at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:97)
      at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
      at org.apache.spark.scheduler.Task.run(Task.scala:114)
      at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:323)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
      at java.lang.Thread.run(Thread.java:834)
      
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set(shuffle_5_1431_805, shuffle_5_1431_808, shuffle_5_1431_806, shuffle_5_1431_809, shuffle_5_1431_807)
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set(shuffle_5_1431_808, shuffle_5_1431_806, shuffle_5_1431_809, shuffle_5_1431_807)
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set(shuffle_5_1431_808, shuffle_5_1431_809, shuffle_5_1431_807)
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set(shuffle_5_1431_808, shuffle_5_1431_809)
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set(shuffle_5_1431_809)
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set()
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 21
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 20
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 19
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 18
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 17
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 16
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 15
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 14
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 13
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 12
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 11
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 10
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 9
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 8
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 7
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 6
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 5
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 4
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 3
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 2
      17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 1
      17/05/26 12:04:14 WARN TransportChannelHandler: Exception in connection from bigdata-apache-hdp-132.xg01/10.0.132.58:7337
      java.io.IOException: Connection reset by peer
      	at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
      	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
      	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
      	at sun.nio.ch.IOUtil.read(IOUtil.java:192)
      	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
      	at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:221)
      	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:899)
      	at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:275)
      	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
      	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
      	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
      	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
      	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
      	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
      	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
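
The thread dump above shows the task thread parked in LinkedBlockingQueue.take(), called from ShuffleBlockFetcherIterator.next. The following is a minimal, self-contained sketch of that waiting pattern, not Spark code: it assumes a result queue on which an expected entry is never enqueued, and contrasts the unbounded take() with a bounded poll(). The class FetchWaitSketch, the type FetchResult, and the helper nextOrTimeout are hypothetical names used only for illustration; the block id is borrowed from the log above.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class FetchWaitSketch {
    // Hypothetical stand-in for a fetched shuffle block; not a Spark type.
    static final class FetchResult {
        final String blockId;
        FetchResult(String blockId) { this.blockId = blockId; }
    }

    // Waits for the next result but gives up after timeoutMs, instead of
    // parking indefinitely the way an unbounded take() does.
    // Returns null if nothing arrives in time.
    static FetchResult nextOrTimeout(LinkedBlockingQueue<FetchResult> results,
                                     long timeoutMs) throws InterruptedException {
        return results.poll(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        LinkedBlockingQueue<FetchResult> results = new LinkedBlockingQueue<>();
        results.put(new FetchResult("shuffle_5_1431_809"));

        // One result is available, so this returns immediately.
        FetchResult first = nextOrTimeout(results, 100);
        System.out.println("got " + first.blockId);

        // Assumption for the sketch: no further result is ever enqueued.
        // results.take() would block here forever; poll() returns null instead.
        FetchResult second = nextOrTimeout(results, 100);
        System.out.println(second == null
                ? "timed out waiting for a result"
                : "got " + second.blockId);
    }
}
```

Whether a missing result is what actually happened in this report is not established by the logs; the sketch only shows why a consumer blocked in take() stays in WAITING (parking) until something is put on the queue.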
      

      Attachments

        1. executor_jstack (52 kB, cen yuhai)
        2. executor_log (212 kB, cen yuhai)
        3. screenshot-1.png (33 kB, cen yuhai)
        4. screenshot-2.png (213 kB, cen yuhai)
        5. screenshot-3.png (11 kB, cen yuhai)

          People

            Assignee: Unassigned
            Reporter: cen yuhai (cenyuhai)
            Votes: 0
            Watchers: 4
