Apache Drill / DRILL-3926

TPCH Concurrency Scale tests hit ChannelClosedException

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.15.0
    • Component/s: Functions - Drill
    • Labels:
      None
    • Environment:

      ucs-node 1 - node 11 (10+1 node cluster), RHEL 6.4 Linux 2.6.32-358.el6.x86_64, MapR 4.0.2.29870.GA, MapR Drill 1.2 gitID eafe0a2

      Description

      In the TPCH concurrency tests, we measure how Drill scales with the number of threads, with each thread running a simple query (TPC-H query #6). With 96 threads, many threads terminated with ChannelClosedException and/or ForemanException:
      2015-10-07 18:01:26 [pip87] ERROR PipSQuawkling executeQuery - [ 0 / 06_par100 ] SYSTEM ERROR: ChannelClosedException

      [Error Id: cbae3879-8067-47cd-8c42-91a38896b81a on ucs-node9.perf.lab:31010]
      java.sql.SQLException: SYSTEM ERROR: ChannelClosedException

      [Error Id: cbae3879-8067-47cd-8c42-91a38896b81a on ucs-node9.perf.lab:31010]
      at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
      at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:290)
      at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1359)
      at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:74)
      at net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:404)
      at net.hydromatic.avatica.AvaticaStatement.executeQueryInternal(AvaticaStatement.java:351)
      at net.hydromatic.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:78)
      at org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:97)
      at PipSQuawkling.executeQuery(PipSQuawkling.java:295)
      at PipSQuawkling.executeTest(PipSQuawkling.java:148)
      at PipSQuawkling.run(PipSQuawkling.java:76)
      Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: ChannelClosedException

      [Error Id: cbae3879-8067-47cd-8c42-91a38896b81a on ucs-node9.perf.lab:31010]
      at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
      at org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:110)
      at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
      at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
      at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61)
      at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:233)
      at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:205)
      at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
      at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
      at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
      at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
      at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
      at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
      at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
      at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
      at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
      at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
      at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
      at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
      at java.lang.Thread.run(Thread.java:744)

      2015-10-07 18:18:12 [pip10] ERROR PipSQuawkling fetchRows - [ 1 / 06_par100 ] SYSTEM ERROR: ForemanException: One more more nodes lost connectivity during query. Identified nodes were [ucs-node4.perf.lab:31010].

      [Error Id: 443e598e-8511-40be-a5f6-9e1c0614a33b on ucs-node9.perf.lab:31010]
      java.sql.SQLException: SYSTEM ERROR: ForemanException: One more more nodes lost connectivity during query. Identified nodes were [ucs-node4.perf.lab:31010].

      [Error Id: 443e598e-8511-40be-a5f6-9e1c0614a33b on ucs-node9.perf.lab:31010]
      at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
      at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:320)
      at net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187)
      at org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:160)
      at PipSQuawkling.fetchRows(PipSQuawkling.java:330)
      at PipSQuawkling.executeTest(PipSQuawkling.java:158)
      at PipSQuawkling.run(PipSQuawkling.java:76)
      Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: ForemanException: One more more nodes lost connectivity during query. Identified nodes were [ucs-node4.perf.lab:31010].

      [Error Id: 443e598e-8511-40be-a5f6-9e1c0614a33b on ucs-node9.perf.lab:31010]
      at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
      at org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:110)
      at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
      at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
      at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61)
      at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:233)
      at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:205)
      at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
      at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
      at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
      at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
      at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
      at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
      at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
      at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
      at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
      at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
      at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
      at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
      at java.lang.Thread.run(Thread.java:744)
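
      Each client-side error above carries an "[Error Id: ... on host:port]" line that names the drillbit reporting the failure (ucs-node9.perf.lab:31010 in both traces). When triaging many failed threads, it can help to pull out that ID and host so the failure can be looked up in the corresponding drillbit log. The following is a minimal sketch, not part of the original test harness; the class name DrillErrorId and the regular expression are assumptions based only on the line format shown in this report.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the error ID and reporting node from a client error line. */
final class DrillErrorId {
    // Assumed format, taken from the log lines in this report:
    // [Error Id: <36-char UUID> on <host>:<port>]
    private static final Pattern ERROR_ID = Pattern.compile(
            "\\[Error Id: ([0-9a-fA-F-]{36}) on ([^:\\]\\s]+):(\\d+)\\]");

    final String id;
    final String host;
    final int port;

    private DrillErrorId(String id, String host, int port) {
        this.id = id;
        this.host = host;
        this.port = port;
    }

    /** Returns the parsed error ID, or empty if the line has no Error Id marker. */
    static Optional<DrillErrorId> parse(String line) {
        Matcher m = ERROR_ID.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new DrillErrorId(
                m.group(1), m.group(2), Integer.parseInt(m.group(3))));
    }
}
```

      Grouping the parsed results by host would show whether the failures cluster on a single foreman node or are spread across the cluster.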

      All the drillbits were still alive when the exception was hit.
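
      For anyone trying to reproduce this, the test shape described above (N client threads started together, each running the same query, with failures tallied by error message) can be sketched as follows. This is an illustrative sketch only: the actual harness (PipSQuawkling) is not included in this report, and the names ConcurrencyHarness and QueryTask are hypothetical. The per-thread work is passed in as a callback so that the sketch does not depend on a live cluster.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the per-thread work, e.g. running TPC-H query 6 over JDBC. */
interface QueryTask {
    void run(int threadIndex) throws Exception;
}

final class ConcurrencyHarness {
    /** Outcome of one run: success count plus failure counts keyed by error message. */
    static final class Result {
        final int succeeded;
        final Map<String, Integer> failuresByMessage;

        Result(int succeeded, Map<String, Integer> failuresByMessage) {
            this.succeeded = succeeded;
            this.failuresByMessage = failuresByMessage;
        }
    }

    /** Starts all threads at once, runs the task once per thread, and tallies outcomes. */
    static Result run(int threads, QueryTask task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        AtomicInteger ok = new AtomicInteger();
        Map<String, Integer> failures = new ConcurrentHashMap<>();

        for (int i = 0; i < threads; i++) {
            final int idx = i;
            pool.execute(() -> {
                try {
                    start.await();          // hold every worker until all are queued
                    task.run(idx);
                    ok.incrementAndGet();
                } catch (Exception e) {
                    // Group failures by message, e.g. "SYSTEM ERROR: ChannelClosedException"
                    failures.merge(String.valueOf(e.getMessage()), 1, Integer::sum);
                }
            });
        }

        start.countDown();                  // release all workers together
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return new Result(ok.get(), failures);
    }
}
```

      In a real run, the QueryTask would open a JDBC connection to the cluster and execute the query; calling ConcurrencyHarness.run(96, task) would then correspond to the 96-thread case in this report.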

              People

              • Assignee: Dechang Gu (dechanggu)
              • Reporter: Dechang Gu (dechanggu)
              • Votes: 0
              • Watchers: 3
