Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-5470

Offset vector data corruption with CSV data

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.10.0
    • Fix Version/s: None
    • Component/s: Server
    • Labels:
      None
    • Environment:
      • ubuntu 14.04
      • r3.8xl (32 CPU/240GB Mem)
      • openjdk version "1.8.0_111"
      • drill 1.10.0 with 8656c83b00f8ab09fb6817e4e9943b2211772541 cherry-picked

      Description

      Per the mailing list discussion and Rahul's and Paul's suggestion I'm filing this Jira issue. Drill seems to be running out of memory when doing an External Sort. Per Zelaine's suggestion I enabled sort.external.disable_managed in drill-override.conf and in the sqlline session. This caused the query to run for longer but it still would fail with the same message.

      Per Paul's suggestion, I enabled debug logging for the org.apache.drill.exec.physical.impl.xsort.managed package and re-ran the query.

      Here's the initial DEBUG line for ExternalSortBatch for our query:

      2017-05-03 12:02:56,095 [26f600f1-17b3-d649-51be-2ca0c9bf7606:frag:2:15] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Config: memory limit = 10737418240, spill file size = 268435456, spill batch size = 8388608, merge limit = 2147483647, merge batch size = 16777216

      And here's the last DEBUG line before the stack trace:

      2017-05-03 12:37:44,249 [26f600f1-17b3-d649-51be-2ca0c9bf7606:frag:2:4] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Available memory: 10737418240, buffer memory = 10719535268, merge memory = 10707140978

      And the stacktrace:

      2017-05-03 12:38:02,927 [26f600f1-17b3-d649-51be-2ca0c9bf7606:frag:2:6] INFO o.a.d.e.p.i.x.m.ExternalSortBatch - User Error Occurred: External Sort encountered an error while spilling to disk (Un
      able to allocate buffer of size 268435456 due to memory limit. Current allocation: 10579849472)
      org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: External Sort encountered an error while spilling to disk

      [Error Id: 5d53c677-0cd9-4c01-a664-c02089670a1c ]
      at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544) ~[drill-common-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.doMergeAndSpill(ExternalSortBatch.java:1447) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:1376) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.spillFromMemory(ExternalSortBatch.java:1339) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.processBatch(ExternalSortBatch.java:831) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.loadBatch(ExternalSortBatch.java:618) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:660) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext(ExternalSortBatch.java:559) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext(StreamingAggBatch.java:137) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:144) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:232) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:226) [drill-java-exec-1.10.0.jar:1.10.0]
      at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_111]
      at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_111]
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) [hadoop-common-2.7.1.jar:na]
      at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:226) [drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.10.0.jar:1.10.0]
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
      at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
      Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate buffer of size 268435456 due to memory limit. Current allocation: 10579849472
      at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220) ~[drill-memory-base-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:195) ~[drill-memory-base-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.vector.VarCharVector.reAlloc(VarCharVector.java:425) ~[vector-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.vector.VarCharVector.copyFromSafe(VarCharVector.java:278) ~[vector-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.vector.NullableVarCharVector.copyFromSafe(NullableVarCharVector.java:379) ~[vector-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.test.generated.PriorityQueueCopierGen140.doCopy(PriorityQueueCopierTemplate.java:22) ~[na:na]
      at org.apache.drill.exec.test.generated.PriorityQueueCopierGen140.next(PriorityQueueCopierTemplate.java:76) ~[na:na]
      at org.apache.drill.exec.physical.impl.xsort.managed.CopierHolder$BatchMerger.next(CopierHolder.java:234) ~[drill-java-exec-1.10.0.jar:1.10.0]
      at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.doMergeAndSpill(ExternalSortBatch.java:1408) [drill-java-exec-1.10.0.jar:1.10.0]
      ... 24 common frames omitted

      I'm in communication with Paul and will send him the full log file.

      Thanks,
      Nathan

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                paul-rogers Paul Rogers
                Reporter:
                butlern Nathan Butler
                Reviewer:
                Rahul Kumar Challapalli
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: