Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-3714

Query runs out of memory and remains in CANCELLATION_REQUESTED state until drillbit is restarted

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.7.0
    • Component/s: Execution - Flow
    • Labels:
      None

      Description

      This is a variation of DRILL-3705 with the difference of drill behavior when hitting OOM condition.

      Query runs out of memory during execution and remains in "CANCELLATION_REQUESTED" state until drillbit is bounced.
      Client (sqlline in this case) never gets a response from the server.

      Reproduction details:
      Single node drillbit installation.
      DRILL_MAX_DIRECT_MEMORY="8G"
      DRILL_HEAP="4G"

      Run this query on TPCDS SF100 data set

      SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS TotalSpend FROM store_sales ss WHERE ss.ss_store_sk IS NOT NULL ORDER BY 1 LIMIT 10;
      

      drillbit.log

      2015-08-26 16:54:58,469 [2a2210a7-7a78-c774-d54c-c863d0b77bb0:frag:3:22] INFO  o.a.d.e.w.f.FragmentStatusReporter - 2a2210a7-7a78-c774-d54c-c863d0b77bb0:3:22: State to report: RUNNING
      2015-08-26 16:55:50,498 [BitServer-5] WARN  o.a.drill.exec.rpc.data.DataServer - Message of mode REQUEST of rpc type 3 took longer than 500ms.  Actual duration was 2569ms.
      2015-08-26 16:56:31,086 [BitServer-5] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  Connection: /10.10.88.133:31012 <--> /10.10.88.133:54554 (data server).  Closing connection.
      io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct buffer memory
              at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233) ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
              at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
              at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
              at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
              at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
      Caused by: java.lang.OutOfMemoryError: Direct buffer memory
              at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_71]
              at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.7.0_71]
              at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) ~[na:1.7.0_71]
              at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:437) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.buffer.PoolArena.reallocate(PoolArena.java:280) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:110) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:841) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:831) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.buffer.WrappedByteBuf.writeBytes(WrappedByteBuf.java:600) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.buffer.UnsafeDirectLittleEndian.writeBytes(UnsafeDirectLittleEndian.java:28) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:4.0.27.Final]
              at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:92) ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:227) ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
              ... 11 common frames omitted
      2015-08-26 16:56:31,087 [BitServer-5] INFO  o.a.d.exec.rpc.ProtobufLengthDecoder - Channel is closed, discarding remaining 124958 byte(s) in buffer.
      2015-08-26 16:56:31,087 [BitClient-1] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  Connection: /10.10.88.133:54554 <--> /10.10.88.133:31012 (data client).  Closing connection.
      java.io.IOException: syscall:read(...)() failed: Connection reset by peer
      2015-08-26 16:56:31,088 [BitClient-1] INFO  o.a.drill.exec.rpc.data.DataClient - Channel closed /10.10.88.133:54554 <--> /10.10.88.133:31012.
      2015-08-26 16:56:35,325 [BitServer-6] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  Connection: /10.10.88.133:31012 <--> /10.10.88.133:54555 (data server).  Closing connection.
      io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct buffer memory
              at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233) ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
              at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
              at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
              at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
              at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
              at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
      

      Attached:
      query_profile_2a2210a7-7a78-c774-d54c-c863d0b77bb0.json
      drillbit.log
      jstack.txt

        Attachments

        1. Screen Shot 2015-08-26 at 10.36.33 AM.png
          768 kB
          Victoria Markman
        2. jstack.txt
          681 kB
          Victoria Markman
        3. drillbit.log
          115 kB
          Victoria Markman
        4. query_profile_2a2210a7-7a78-c774-d54c-c863d0b77bb0.json
          115 kB
          Victoria Markman

          Issue Links

            Activity

              People

              • Assignee:
                jnadeau Jacques Nadeau
                Reporter:
                vicky Victoria Markman
                Reviewer:
                Khurram Faraaz
              • Votes:
                0 Vote for this issue
                Watchers:
                7 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: