  1. Apache Drill
  2. DRILL-6329

TPC-DS Query 66 failed due to OOM

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 1.14.0
    • Fix Version/s: None
    • Component/s: Execution - Flow
    • Labels: None

    Description

      TPC-DS Query 66 failed after 27 minutes on Drill 1.14.0 on a 4-node cluster against SF1 Parquet data (dfs.tpcds_sf1_parquet_views). The query, its query profile, and its query plan are attached.

      This appears to be a regression: the same query ran successfully on 1.10.0.

      On Drill 1.10.0 (git commit id bbcf4b76): completed successfully in 9.026 seconds.
      On Drill 1.14.0 (git.commit.id.abbrev=da24113): failed with an OutOfMemoryException after running for 27 minutes.

      Stack trace from the sqlline console (no stack trace was written to drillbit.log):

      Error: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
      
      Too little memory available
      Fragment 2:0
      
      [Error Id: 5636a939-a318-4b59-b3e8-9eb93f6b82f3 on qa102-45.qa.lab:31010]
      
      (org.apache.drill.exec.exception.OutOfMemoryException) Too little memory available
       org.apache.drill.exec.test.generated.HashAggregatorGen7120.delayedSetup():409
       org.apache.drill.exec.test.generated.HashAggregatorGen7120.doWork():579
       org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():176
       org.apache.drill.exec.record.AbstractRecordBatch.next():164
       org.apache.drill.exec.record.AbstractRecordBatch.next():119
       org.apache.drill.exec.record.AbstractRecordBatch.next():109
       org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
       org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():134
       org.apache.drill.exec.record.AbstractRecordBatch.next():164
       org.apache.drill.exec.physical.impl.BaseRootExec.next():105
       org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
       org.apache.drill.exec.physical.impl.BaseRootExec.next():95
       org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():292
       org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():279
       java.security.AccessController.doPrivileged():-2
       javax.security.auth.Subject.doAs():422
       org.apache.hadoop.security.UserGroupInformation.doAs():1595
       org.apache.drill.exec.work.fragment.FragmentExecutor.run():279
       org.apache.drill.common.SelfCleaningRunnable.run():38
       java.util.concurrent.ThreadPoolExecutor.runWorker():1149
       java.util.concurrent.ThreadPoolExecutor$Worker.run():624
       java.lang.Thread.run():748 (state=,code=0)
      java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
      
      Too little memory available
      Fragment 2:0
      
      [Error Id: 5636a939-a318-4b59-b3e8-9eb93f6b82f3 on qa102-45.qa.lab:31010]
      
      (org.apache.drill.exec.exception.OutOfMemoryException) Too little memory available
       org.apache.drill.exec.test.generated.HashAggregatorGen7120.delayedSetup():409
       org.apache.drill.exec.test.generated.HashAggregatorGen7120.doWork():579
       org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():176
       org.apache.drill.exec.record.AbstractRecordBatch.next():164
       org.apache.drill.exec.record.AbstractRecordBatch.next():119
       org.apache.drill.exec.record.AbstractRecordBatch.next():109
       org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
       org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():134
       org.apache.drill.exec.record.AbstractRecordBatch.next():164
       org.apache.drill.exec.physical.impl.BaseRootExec.next():105
       org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
       org.apache.drill.exec.physical.impl.BaseRootExec.next():95
       org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():292
       org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():279
       java.security.AccessController.doPrivileged():-2
       javax.security.auth.Subject.doAs():422
       org.apache.hadoop.security.UserGroupInformation.doAs():1595
       org.apache.drill.exec.work.fragment.FragmentExecutor.run():279
       org.apache.drill.common.SelfCleaningRunnable.run():38
       java.util.concurrent.ThreadPoolExecutor.runWorker():1149
       java.util.concurrent.ThreadPoolExecutor$Worker.run():624
       java.lang.Thread.run():748
       
      ...
      Caused by: org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
      
      Too little memory available
      Fragment 2:0
      
      [Error Id: 5636a939-a318-4b59-b3e8-9eb93f6b82f3 on qa102-45.qa.lab:31010]
      
      (org.apache.drill.exec.exception.OutOfMemoryException) Too little memory available
       org.apache.drill.exec.test.generated.HashAggregatorGen7120.delayedSetup():409
       org.apache.drill.exec.test.generated.HashAggregatorGen7120.doWork():579
       org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():176
       org.apache.drill.exec.record.AbstractRecordBatch.next():164
       org.apache.drill.exec.record.AbstractRecordBatch.next():119
       org.apache.drill.exec.record.AbstractRecordBatch.next():109
       org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
       org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():134
       org.apache.drill.exec.record.AbstractRecordBatch.next():164
       org.apache.drill.exec.physical.impl.BaseRootExec.next():105
       org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
       org.apache.drill.exec.physical.impl.BaseRootExec.next():95
       org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():292
       org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():279
       java.security.AccessController.doPrivileged():-2
       javax.security.auth.Subject.doAs():422
       org.apache.hadoop.security.UserGroupInformation.doAs():1595
       org.apache.drill.exec.work.fragment.FragmentExecutor.run():279
       org.apache.drill.common.SelfCleaningRunnable.run():38
       java.util.concurrent.ThreadPoolExecutor.runWorker():1149
       java.util.concurrent.ThreadPoolExecutor$Worker.run():624
       java.lang.Thread.run():748
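
For triage, the identifying fields in the output above (fragment, error id, host, exception class, and topmost frame) can be extracted mechanically. The following is a minimal sketch, not part of Drill or sqlline: the function name `summarize_drill_error` and the regular expressions are assumptions based only on the text format shown in this report, and the sample is an abbreviated copy of that output.

```python
import re

# Abbreviated copy of the sqlline output from this report (first three frames only).
SAMPLE = """\
Error: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.

Too little memory available
Fragment 2:0

[Error Id: 5636a939-a318-4b59-b3e8-9eb93f6b82f3 on qa102-45.qa.lab:31010]

(org.apache.drill.exec.exception.OutOfMemoryException) Too little memory available
 org.apache.drill.exec.test.generated.HashAggregatorGen7120.delayedSetup():409
 org.apache.drill.exec.test.generated.HashAggregatorGen7120.doWork():579
 org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():176
"""


def summarize_drill_error(text):
    """Hypothetical helper: return the first occurrence of each identifying field.

    The patterns are inferred from the output format in this report only;
    other Drill versions or clients may format errors differently.
    """
    fragment = re.search(r"Fragment (\d+):(\d+)", text)
    error = re.search(r"\[Error Id: ([0-9a-f-]+) on ([^\]\s]+)\]", text)
    # Exception class appears in parentheses at the start of a line.
    exc = re.search(r"^\s*\(([\w.$]+)\)", text, re.MULTILINE)
    # Frames are indented and look like "package.Class.method():line".
    frame = re.search(r"^\s+([\w.$]+)\(\):(-?\d+)", text, re.MULTILINE)
    return {
        "fragment": f"{fragment.group(1)}:{fragment.group(2)}" if fragment else None,
        "error_id": error.group(1) if error else None,
        "host": error.group(2) if error else None,
        "exception": exc.group(1) if exc else None,
        "top_frame": f"{frame.group(1)}:{frame.group(2)}" if frame else None,
    }


if __name__ == "__main__":
    for key, value in summarize_drill_error(SAMPLE).items():
        print(f"{key}: {value}")
```

Because the three traces in the console output repeat the same frames, taking only the first match of each pattern is enough to identify the failing operator (here, the generated hash aggregator in fragment 2:0).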
       
      

      Attachments

        1. 252f0f20-2774-43d7-ec31-911ee0f5f330.sys.drill
          100 kB
          Khurram Faraaz
        2. TPCDS_Query_66_PLAN.txt
          59 kB
          Khurram Faraaz
        3. TPCDS_Query_66.sql
          8 kB
          Khurram Faraaz

          People

            Assignee: Boaz Ben-Zvi (ben-zvi)
            Reporter: Khurram Faraaz (khfaraaz)
            Votes: 0
            Watchers: 4
