Apache Drill / DRILL-2418

Memory leak during execution if comparison function is not found

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 1.2.0
    • Component/s: Execution - Flow
    • Labels: None

      Description

      While testing implicit casts in joins, I found that queries which throw an exception during execution appear to leak memory: if you run enough of them, Drill eventually runs out of memory.

      Here is a query example:

      select count(*) from cast_tbl_1 a, cast_tbl_2 b where a.c_float = b.c_time
       failed: RemoteRpcException: Failure while running fragment., Failure finding function that runtime code generation expected.  Signature: compare_to_nulls_high( TIME:OPTIONAL, FLOAT4:OPTIONAL ) returns INT:REQUIRED [ 633c8ce3-1ed2-4a0a-8248-1e3d5b4f7c0a on atsqa4-133.qa.lab:31010 ]
      [ 633c8ce3-1ed2-4a0a-8248-1e3d5b4f7c0a on atsqa4-133.qa.lab:31010 ]
      Test_Failed: 2015/03/10 18:34:15.0015 - Failed to execute.
      

      With planner.slice_target set to 1, my cluster hits out of memory after about 40 such failures.

      select count(*) from cast_tbl_1 a, cast_tbl_2 b where a.d38 = b.c_double
      Query failed: OutOfMemoryException: You attempted to create a new child allocator with initial reservation 3000000 but only 916199 bytes of memory were available.
      

      From drillbit.log:

      2015-03-10 18:34:34,588 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  o.a.d.e.store.parquet.FooterGatherer - Fetch Parquet Footers: Executed 1 out of 1 using 1 threads. Time: 1ms total, 1.190007ms avg, 1ms max.
      2015-03-10 18:34:34,591 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  o.a.d.e.store.parquet.FooterGatherer - Fetch Parquet Footers: Executed 1 out of 1 using 1 threads. Time: 0ms total, 0.953679ms avg, 0ms max.
      2015-03-10 18:34:34,627 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host atsqa4-136.qa.lab.  Skipping affinity to that host.
      2015-03-10 18:34:34,627 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  o.a.d.e.s.parquet.ParquetGroupScan - Load Parquet RowGroup block maps: Executed 1 out of 1 using 1 threads. Time: 1ms total, 1.609586ms avg, 1ms max.
      2015-03-10 18:34:34,629 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host atsqa4-136.qa.lab.  Skipping affinity to that host.
      2015-03-10 18:34:34,629 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  o.a.d.e.s.parquet.ParquetGroupScan - Load Parquet RowGroup block maps: Executed 1 out of 1 using 1 threads. Time: 1ms total, 1.270340ms avg, 1ms max.
      2015-03-10 18:34:34,684 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> FAILED
      org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Failure while getting memory allocator for fragment.
              at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:195) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
              at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
      Caused by: org.apache.drill.common.exceptions.ExecutionSetupException: Failure while getting memory allocator for fragment.
              at org.apache.drill.exec.ops.FragmentContext.<init>(FragmentContext.java:119) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman.setupRootFragment(Foreman.java:535) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:307) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:511) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:186) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              ... 4 common frames omitted
      Caused by: org.apache.drill.exec.memory.OutOfMemoryException: You attempted to create a new child allocator with initial reservation 3000000 but only 916199 bytes of memory were available.
              at org.apache.drill.exec.memory.TopLevelAllocator.getChildAllocator(TopLevelAllocator.java:121) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.ops.FragmentContext.<init>(FragmentContext.java:116) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              ... 8 common frames omitted
      2015-03-10 18:34:34,700 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - Error 96a7baf4-f17a-454c-831b-f3dc77bd4381: OutOfMemoryException: You attempted to create a new child allocator with initial reservation 3000000 but only 916199 bytes of memory were available.
      org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Failure while getting memory allocator for fragment.
              at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:195) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
              at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
      Caused by: org.apache.drill.common.exceptions.ExecutionSetupException: Failure while getting memory allocator for fragment.
              at org.apache.drill.exec.ops.FragmentContext.<init>(FragmentContext.java:119) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman.setupRootFragment(Foreman.java:535) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:307) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:511) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:186) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              ... 4 common frames omitted
      Caused by: org.apache.drill.exec.memory.OutOfMemoryException: You attempted to create a new child allocator with initial reservation 3000000 but only 916199 bytes of memory were available.
              at org.apache.drill.exec.memory.TopLevelAllocator.getChildAllocator(TopLevelAllocator.java:121) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.ops.FragmentContext.<init>(FragmentContext.java:116) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              ... 8 common frames omitted
      2015-03-10 18:34:34,700 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - foreman cleaning up - status: [0=>[0=>FragmentData [isLocal=true, status=profile {
      

      I will attach a reproduction. Note that I have no proof that this error is what actually causes the memory leak; that is speculation on my part.
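
      For illustration only, the suspected pattern can be sketched with a toy allocator. This is not Drill's allocator code and the report does not confirm the root cause: ToyAllocator, ToyChild, failingSetup and the 120,000,000-byte pool are all hypothetical, chosen so the numbers resemble the ones above (a 3,000,000-byte initial reservation per fragment, exhaustion after about 40 failures). The sketch assumes that a fragment's reservation is not returned to the parent when setup throws.

```java
public class LeakSketch {
    /** Toy stand-in for a top-level allocator; hypothetical, NOT Drill's TopLevelAllocator. */
    static final class ToyAllocator {
        private final long limit;
        private long reserved;

        ToyAllocator(long limit) { this.limit = limit; }

        /** Reserves bytes for a child, or throws if the pool cannot cover the reservation. */
        ToyChild newChild(long initialReservation) {
            long available = limit - reserved;
            if (initialReservation > available) {
                throw new IllegalStateException("initial reservation " + initialReservation
                        + " but only " + available + " bytes available");
            }
            reserved += initialReservation;
            return new ToyChild(this, initialReservation);
        }

        long available() { return limit - reserved; }
    }

    /** Toy child allocator that gives its reservation back to the parent on close(). */
    static final class ToyChild implements AutoCloseable {
        private final ToyAllocator parent;
        private final long reservation;
        private boolean closed;

        ToyChild(ToyAllocator parent, long reservation) {
            this.parent = parent;
            this.reservation = reservation;
        }

        @Override
        public void close() {
            if (!closed) {
                parent.reserved -= reservation;
                closed = true;
            }
        }
    }

    /** Simulates fragment setup that fails, e.g. because a comparison function is missing. */
    static void failingSetup() {
        throw new UnsupportedOperationException("function not found");
    }

    /** Leaky variant: the child is never closed when setup throws. */
    static void runLeaky(ToyAllocator root, long reservation) {
        ToyChild child = root.newChild(reservation);
        failingSetup();   // throws, so close() below is never reached
        child.close();
    }

    /** Safe variant: try-with-resources releases the reservation on every exit path. */
    static void runSafe(ToyAllocator root, long reservation) {
        try (ToyChild child = root.newChild(reservation)) {
            failingSetup();
        }
    }

    /** Counts failing queries that run before the pool is exhausted, capped at maxQueries. */
    static int failuresUntilExhausted(ToyAllocator root, long reservation,
                                      boolean leaky, int maxQueries) {
        for (int i = 0; i < maxQueries; i++) {
            try {
                if (leaky) runLeaky(root, reservation); else runSafe(root, reservation);
            } catch (UnsupportedOperationException expected) {
                // The query failed as expected; move on to the next one.
            } catch (IllegalStateException outOfMemory) {
                return i;  // the pool could not cover another reservation
            }
        }
        return maxQueries;
    }

    public static void main(String[] args) {
        long pool = 120_000_000L;       // hypothetical pool size
        long reservation = 3_000_000L;  // initial reservation seen in the log above
        System.out.println(failuresUntilExhausted(new ToyAllocator(pool), reservation, true, 1000));  // prints 40
        System.out.println(failuresUntilExhausted(new ToyAllocator(pool), reservation, false, 1000)); // prints 1000
    }
}
```

      In the leaky variant the pool shrinks by one reservation per failed query until a new child cannot be created, which matches the symptom reported here. The safe variant never exhausts the pool because the reservation is released even when setup throws.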

        Attachments

        1. not_supported_cast.txt
          7 kB
          Victoria Markman
        2. cast_tbl_2.parquet
          3 kB
          Victoria Markman
        3. cast_tbl_1.parquet
          3 kB
          Victoria Markman

            People

            • Assignee: Chris Westin (cwestin)
            • Reporter: Victoria Markman (vicky)
            • Votes: 0
            • Watchers: 3
