Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-6566

Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.

    Details

      Description

      This is TPCDS Query 66.

      Query: tpcds/tpcds_sf1/hive-generated-parquet/hive1_native/query66.sql

      SELECT w_warehouse_name,
      w_warehouse_sq_ft,
      w_city,
      w_county,
      w_state,
      w_country,
      ship_carriers,
      year1,
      Sum(jan_sales) AS jan_sales,
      Sum(feb_sales) AS feb_sales,
      Sum(mar_sales) AS mar_sales,
      Sum(apr_sales) AS apr_sales,
      Sum(may_sales) AS may_sales,
      Sum(jun_sales) AS jun_sales,
      Sum(jul_sales) AS jul_sales,
      Sum(aug_sales) AS aug_sales,
      Sum(sep_sales) AS sep_sales,
      Sum(oct_sales) AS oct_sales,
      Sum(nov_sales) AS nov_sales,
      Sum(dec_sales) AS dec_sales,
      Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot,
      Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot,
      Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot,
      Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot,
      Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot,
      Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot,
      Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot,
      Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot,
      Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot,
      Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot,
      Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot,
      Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot,
      Sum(jan_net) AS jan_net,
      Sum(feb_net) AS feb_net,
      Sum(mar_net) AS mar_net,
      Sum(apr_net) AS apr_net,
      Sum(may_net) AS may_net,
      Sum(jun_net) AS jun_net,
      Sum(jul_net) AS jul_net,
      Sum(aug_net) AS aug_net,
      Sum(sep_net) AS sep_net,
      Sum(oct_net) AS oct_net,
      Sum(nov_net) AS nov_net,
      Sum(dec_net) AS dec_net
      FROM (SELECT w_warehouse_name,
      w_warehouse_sq_ft,
      w_city,
      w_county,
      w_state,
      w_country,
      'ZOUROS'
      || ','
      || 'ZHOU' AS ship_carriers,
      d_year AS year1,
      Sum(CASE
      WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity
      ELSE 0
      END) AS jan_sales,
      Sum(CASE
      WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity
      ELSE 0
      END) AS feb_sales,
      Sum(CASE
      WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity
      ELSE 0
      END) AS mar_sales,
      Sum(CASE
      WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity
      ELSE 0
      END) AS apr_sales,
      Sum(CASE
      WHEN d_moy = 5 THEN ws_ext_sales_price * ws_quantity
      ELSE 0
      END) AS may_sales,
      Sum(CASE
      WHEN d_moy = 6 THEN ws_ext_sales_price * ws_quantity
      ELSE 0
      END) AS jun_sales,
      Sum(CASE
      WHEN d_moy = 7 THEN ws_ext_sales_price * ws_quantity
      ELSE 0
      END) AS jul_sales,
      Sum(CASE
      WHEN d_moy = 8 THEN ws_ext_sales_price * ws_quantity
      ELSE 0
      END) AS aug_sales,
      Sum(CASE
      WHEN d_moy = 9 THEN ws_ext_sales_price * ws_quantity
      ELSE 0
      END) AS sep_sales,
      Sum(CASE
      WHEN d_moy = 10 THEN ws_ext_sales_price * ws_quantity
      ELSE 0
      END) AS oct_sales,
      Sum(CASE
      WHEN d_moy = 11 THEN ws_ext_sales_price * ws_quantity
      ELSE 0
      END) AS nov_sales,
      Sum(CASE
      WHEN d_moy = 12 THEN ws_ext_sales_price * ws_quantity
      ELSE 0
      END) AS dec_sales,
      Sum(CASE
      WHEN d_moy = 1 THEN ws_net_paid_inc_ship * ws_quantity
      ELSE 0
      END) AS jan_net,
      Sum(CASE
      WHEN d_moy = 2 THEN ws_net_paid_inc_ship * ws_quantity
      ELSE 0
      END) AS feb_net,
      Sum(CASE
      WHEN d_moy = 3 THEN ws_net_paid_inc_ship * ws_quantity
      ELSE 0
      END) AS mar_net,
      Sum(CASE
      WHEN d_moy = 4 THEN ws_net_paid_inc_ship * ws_quantity
      ELSE 0
      END) AS apr_net,
      Sum(CASE
      WHEN d_moy = 5 THEN ws_net_paid_inc_ship * ws_quantity
      ELSE 0
      END) AS may_net,
      Sum(CASE
      WHEN d_moy = 6 THEN ws_net_paid_inc_ship * ws_quantity
      ELSE 0
      END) AS jun_net,
      Sum(CASE
      WHEN d_moy = 7 THEN ws_net_paid_inc_ship * ws_quantity
      ELSE 0
      END) AS jul_net,
      Sum(CASE
      WHEN d_moy = 8 THEN ws_net_paid_inc_ship * ws_quantity
      ELSE 0
      END) AS aug_net,
      Sum(CASE
      WHEN d_moy = 9 THEN ws_net_paid_inc_ship * ws_quantity
      ELSE 0
      END) AS sep_net,
      Sum(CASE
      WHEN d_moy = 10 THEN ws_net_paid_inc_ship * ws_quantity
      ELSE 0
      END) AS oct_net,
      Sum(CASE
      WHEN d_moy = 11 THEN ws_net_paid_inc_ship * ws_quantity
      ELSE 0
      END) AS nov_net,
      Sum(CASE
      WHEN d_moy = 12 THEN ws_net_paid_inc_ship * ws_quantity
      ELSE 0
      END) AS dec_net
      FROM web_sales,
      warehouse,
      date_dim,
      time_dim,
      ship_mode
      WHERE ws_warehouse_sk = w_warehouse_sk
      AND ws_sold_date_sk = d_date_sk
      AND ws_sold_time_sk = t_time_sk
      AND ws_ship_mode_sk = sm_ship_mode_sk
      AND d_year = 1998
      AND t_time BETWEEN 7249 AND 7249 + 28800
      AND sm_carrier IN ( 'ZOUROS', 'ZHOU' )
      GROUP BY w_warehouse_name,
      w_warehouse_sq_ft,
      w_city,
      w_county,
      w_state,
      w_country,
      d_year
      UNION ALL
      SELECT w_warehouse_name,
      w_warehouse_sq_ft,
      w_city,
      w_county,
      w_state,
      w_country,
      'ZOUROS'
      || ','
      || 'ZHOU' AS ship_carriers,
      d_year AS year1,
      Sum(CASE
      WHEN d_moy = 1 THEN cs_ext_sales_price * cs_quantity
      ELSE 0
      END) AS jan_sales,
      Sum(CASE
      WHEN d_moy = 2 THEN cs_ext_sales_price * cs_quantity
      ELSE 0
      END) AS feb_sales,
      Sum(CASE
      WHEN d_moy = 3 THEN cs_ext_sales_price * cs_quantity
      ELSE 0
      END) AS mar_sales,
      Sum(CASE
      WHEN d_moy = 4 THEN cs_ext_sales_price * cs_quantity
      ELSE 0
      END) AS apr_sales,
      Sum(CASE
      WHEN d_moy = 5 THEN cs_ext_sales_price * cs_quantity
      ELSE 0
      END) AS may_sales,
      Sum(CASE
      WHEN d_moy = 6 THEN cs_ext_sales_price * cs_quantity
      ELSE 0
      END) AS jun_sales,
      Sum(CASE
      WHEN d_moy = 7 THEN cs_ext_sales_price * cs_quantity
      ELSE 0
      END) AS jul_sales,
      Sum(CASE
      WHEN d_moy = 8 THEN cs_ext_sales_price * cs_quantity
      ELSE 0
      END) AS aug_sales,
      Sum(CASE
      WHEN d_moy = 9 THEN cs_ext_sales_price * cs_quantity
      ELSE 0
      END) AS sep_sales,
      Sum(CASE
      WHEN d_moy = 10 THEN cs_ext_sales_price * cs_quantity
      ELSE 0
      END) AS oct_sales,
      Sum(CASE
      WHEN d_moy = 11 THEN cs_ext_sales_price * cs_quantity
      ELSE 0
      END) AS nov_sales,
      Sum(CASE
      WHEN d_moy = 12 THEN cs_ext_sales_price * cs_quantity
      ELSE 0
      END) AS dec_sales,
      Sum(CASE
      WHEN d_moy = 1 THEN cs_net_paid * cs_quantity
      ELSE 0
      END) AS jan_net,
      Sum(CASE
      WHEN d_moy = 2 THEN cs_net_paid * cs_quantity
      ELSE 0
      END) AS feb_net,
      Sum(CASE
      WHEN d_moy = 3 THEN cs_net_paid * cs_quantity
      ELSE 0
      END) AS mar_net,
      Sum(CASE
      WHEN d_moy = 4 THEN cs_net_paid * cs_quantity
      ELSE 0
      END) AS apr_net,
      Sum(CASE
      WHEN d_moy = 5 THEN cs_net_paid * cs_quantity
      ELSE 0
      END) AS may_net,
      Sum(CASE
      WHEN d_moy = 6 THEN cs_net_paid * cs_quantity
      ELSE 0
      END) AS jun_net,
      Sum(CASE
      WHEN d_moy = 7 THEN cs_net_paid * cs_quantity
      ELSE 0
      END) AS jul_net,
      Sum(CASE
      WHEN d_moy = 8 THEN cs_net_paid * cs_quantity
      ELSE 0
      END) AS aug_net,
      Sum(CASE
      WHEN d_moy = 9 THEN cs_net_paid * cs_quantity
      ELSE 0
      END) AS sep_net,
      Sum(CASE
      WHEN d_moy = 10 THEN cs_net_paid * cs_quantity
      ELSE 0
      END) AS oct_net,
      Sum(CASE
      WHEN d_moy = 11 THEN cs_net_paid * cs_quantity
      ELSE 0
      END) AS nov_net,
      Sum(CASE
      WHEN d_moy = 12 THEN cs_net_paid * cs_quantity
      ELSE 0
      END) AS dec_net
      FROM catalog_sales,
      warehouse,
      date_dim,
      time_dim,
      ship_mode
      WHERE cs_warehouse_sk = w_warehouse_sk
      AND cs_sold_date_sk = d_date_sk
      AND cs_sold_time_sk = t_time_sk
      AND cs_ship_mode_sk = sm_ship_mode_sk
      AND d_year = 1998
      AND t_time BETWEEN 7249 AND 7249 + 28800
      AND sm_carrier IN ( 'ZOUROS', 'ZHOU' )
      GROUP BY w_warehouse_name,
      w_warehouse_sq_ft,
      w_city,
      w_county,
      w_state,
      w_country,
      d_year) x
      GROUP BY w_warehouse_name,
      w_warehouse_sq_ft,
      w_city,
      w_county,
      w_state,
      w_country,
      ship_carriers,
      year1
      ORDER BY w_warehouse_name
      LIMIT 100;

      Here is the error message with stack trace:

      java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.

      AGGR OOM at First Phase. Partitions: 1. Estimated batch size: 31260672. values size: 25165824. Output alloc size: 25165824. Planned batches: 1 Memory limit: 2302755 so far allocated: 262144.
      Fragment 6:1

      [Error Id: 660fd3f9-bb90-4245-b63c-208fc37c0199 on atsqa6c82.qa.lab:31010]

      (org.apache.drill.exec.exception.OutOfMemoryException) AGGR OOM at First Phase. Partitions: 1. Estimated batch size: 31260672. values size: 25165824. Output alloc size: 25165824. Planned batches: 1 Memory limit: 2302755 so far allocated: 262144.
      org.apache.drill.exec.test.generated.HashAggregatorGen15454.spillIfNeeded():1468
      org.apache.drill.exec.test.generated.HashAggregatorGen15454.doSpill():1430
      org.apache.drill.exec.test.generated.HashAggregatorGen15454.checkGroupAndAggrValues():1353
      org.apache.drill.exec.test.generated.HashAggregatorGen15454.doWork():587
      org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():184
      org.apache.drill.exec.record.AbstractRecordBatch.next():172
      org.apache.drill.exec.record.AbstractRecordBatch.next():119
      org.apache.drill.exec.record.AbstractRecordBatch.next():109
      org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
      org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147
      org.apache.drill.exec.record.AbstractRecordBatch.next():172
      org.apache.drill.exec.physical.impl.BaseRootExec.next():103
      org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
      org.apache.drill.exec.physical.impl.BaseRootExec.next():93
      org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294
      org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281
      java.security.AccessController.doPrivileged():-2
      javax.security.auth.Subject.doAs():422
      org.apache.hadoop.security.UserGroupInformation.doAs():1595
      org.apache.drill.exec.work.fragment.FragmentExecutor.run():281
      org.apache.drill.common.SelfCleaningRunnable.run():38
      java.util.concurrent.ThreadPoolExecutor.runWorker():1149
      java.util.concurrent.ThreadPoolExecutor$Worker.run():624
      java.lang.Thread.run():748

      at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:528)
      at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:632)
      at oadd.org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:207)
      at org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:153)
      at org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:253)
      at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
      Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.

      AGGR OOM at First Phase. Partitions: 1. Estimated batch size: 31260672. values size: 25165824. Output alloc size: 25165824. Planned batches: 1 Memory limit: 2302755 so far allocated: 262144.
      Fragment 6:1

      [Error Id: 660fd3f9-bb90-4245-b63c-208fc37c0199 on atsqa6c82.qa.lab:31010]

      (org.apache.drill.exec.exception.OutOfMemoryException) AGGR OOM at First Phase. Partitions: 1. Estimated batch size: 31260672. values size: 25165824. Output alloc size: 25165824. Planned batches: 1 Memory limit: 2302755 so far allocated: 262144.
      org.apache.drill.exec.test.generated.HashAggregatorGen15454.spillIfNeeded():1468
      org.apache.drill.exec.test.generated.HashAggregatorGen15454.doSpill():1430
      org.apache.drill.exec.test.generated.HashAggregatorGen15454.checkGroupAndAggrValues():1353
      org.apache.drill.exec.test.generated.HashAggregatorGen15454.doWork():587
      org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():184
      org.apache.drill.exec.record.AbstractRecordBatch.next():172
      org.apache.drill.exec.record.AbstractRecordBatch.next():119
      org.apache.drill.exec.record.AbstractRecordBatch.next():109
      org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
      org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147
      org.apache.drill.exec.record.AbstractRecordBatch.next():172
      org.apache.drill.exec.physical.impl.BaseRootExec.next():103
      org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
      org.apache.drill.exec.physical.impl.BaseRootExec.next():93
      org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294
      org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281
      java.security.AccessController.doPrivileged():-2
      javax.security.auth.Subject.doAs():422
      org.apache.hadoop.security.UserGroupInformation.doAs():1595
      org.apache.drill.exec.work.fragment.FragmentExecutor.run():281
      org.apache.drill.common.SelfCleaningRunnable.run():38
      java.util.concurrent.ThreadPoolExecutor.runWorker():1149
      java.util.concurrent.ThreadPoolExecutor$Worker.run():624
      java.lang.Thread.run():748

      at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
      at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
      at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
      at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:274)
      at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:244)
      at oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
      at oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
      at oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
      at oadd.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
      at oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
      at oadd.io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
      at oadd.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
      at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
      at oadd.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
      at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
      at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
      at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
      at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
      at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
      at oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
      ... 1 more

      Here is the commit id:
      1.14.0-SNAPSHOT 140d09e69b65ac2cb1bed09a37fa5861d39a99b3 DRILL-6539: Record count not set for this vector container error 28.06.2018 @ 16:13:20 PDT Unknown 28.06.2018 @ 16:21:42 PDT

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                ben-zvi Boaz Ben-Zvi
                Reporter:
                rhou Robert Hou
                Reviewer:
                Timothy Farkas
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: