Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-2274

Unable to allocate sv2 buffer after repeated attempts : JOIN, Order by used in query

    XMLWordPrintableJSON

    Details

      Description

      git.commit.id.abbrev=6676f2d

      The below query fails :

      select sub1.uid from `data.json` sub1 inner join `data.json` sub2 on sub1.uid = sub2.uid order by sub1.uid;
      

      Error from the logs :

      2015-02-20 00:24:08,431 [2b1981b0-149e-981b-f83f-512c587321d7:frag:1:2] ERROR o.a.d.e.w.f.AbstractStatusReporter - Error 66dba4ff-644c-4400-ab84-203256dc2600: Failure while running fragment.
       java.lang.RuntimeException: org.apache.drill.exec.memory.OutOfMemoryException: Unable to allocate sv2 buffer after repeated attempts
       	at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:307) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:96) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:97) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:116) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
       	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
       	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
       Caused by: org.apache.drill.exec.memory.OutOfMemoryException: Unable to allocate sv2 buffer after repeated attempts
       	at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.newSV2(ExternalSortBatch.java:516) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:305) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
       	... 16 common frames omitted
      

      On a different drillbit in the cluster, I found the below message for the same run

      2015-02-20 00:24:08,435 [BitServer-6] WARN  o.a.d.exec.rpc.control.WorkEventBus - A fragment message arrived but there was no registered listener for that message: profile {
         state: FAILED
         error {
           error_id: "66dba4ff-644c-4400-ab84-203256dc2600"
           endpoint {
             address: "qa-node191.qa.lab"
             user_port: 31010
             control_port: 31011
             data_port: 31012
           }
           message: "Failure while running fragment., Unable to allocate sv2 buffer after repeated attempts [ 66dba4ff-644c-4400-ab84-203256dc2600 on qa-node191.qa.lab:31010 ]\n"
         }
      

      I attached the data file which only has 2 records. I manually copied over the 2 records 50000 times and ran these queries on top of them.

      Let me know if you need anything else

        Attachments

        1. data.json
          4 kB
          Rahul Challapalli

          Issue Links

            Activity

              People

              • Assignee:
                rkins Rahul Challapalli
                Reporter:
                rkins Rahul Challapalli
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: