Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-5804

External Sort times out, may be infinite loop

    XMLWordPrintableJSON

Details

    Description

      Query is:

      ALTER SESSION SET `exec.sort.disable_managed` = false;
      select count(*) from (
        select * from (
          select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid 
          from (
            select d.type type, d.uid uid, flatten(d.map.rm) rms from dfs.`/drill/testdata/resource-manager/nested_large` d order by d.uid
          ) s1
        ) s2
        order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist
      );
      

      Plan is:

      | 00-00    Screen
      00-01      Project(EXPR$0=[$0])
      00-02        StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
      00-03          UnionExchange
      01-01            StreamAgg(group=[{}], EXPR$0=[COUNT()])
      01-02              Project($f0=[0])
      01-03                SingleMergeExchange(sort0=[4 ASC], sort1=[5 ASC], sort2=[6 ASC])
      02-01                  SelectionVectorRemover
      02-02                    Sort(sort0=[$4], sort1=[$5], sort2=[$6], dir0=[ASC], dir1=[ASC], dir2=[ASC])
      02-03                      Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6])
      02-04                        HashToRandomExchange(dist0=[[$4]], dist1=[[$5]], dist2=[[$6]])
      03-01                          UnorderedMuxExchange
      04-01                            Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($6, hash32AsDouble($5, hash32AsDouble($4, 1301011)))])
      04-02                              Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], EXPR$4=[ITEM($2, 'mapid')], EXPR$5=[ITEM($1, 'a')], EXPR$6=[ITEM($1, 'do_not_exist')])
      04-03                                Flatten(flattenField=[$1])
      04-04                                  Project(type=[$0], rptds=[ITEM($2, 'rptd')], rms=[$2], uid=[$1])
      04-05                                    SingleMergeExchange(sort0=[1 ASC])
      05-01                                      SelectionVectorRemover
      05-02                                        Sort(sort0=[$1], dir0=[ASC])
      05-03                                          Project(type=[$0], uid=[$1], rms=[$2])
      05-04                                            HashToRandomExchange(dist0=[[$1]])
      06-01                                              UnorderedMuxExchange
      07-01                                                Project(type=[$0], uid=[$1], rms=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)])
      07-02                                                  Flatten(flattenField=[$2])
      07-03                                                    Project(type=[$0], uid=[$1], rms=[ITEM($2, 'rm')])
      07-04                                                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/resource-manager/nested_large]], selectionRoot=maprfs:/drill/testdata/resource-manager/nested_large, numFiles=1, usedMetadataFile=false, columns=[`type`, `uid`, `map`.`rm`]]])
      

      Here is a segment of the drillbit.log, starting at line 55890:

      2017-09-19 04:22:56,258 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 142 us to sort 1023 records
      2017-09-19 04:22:56,265 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 105 us to sort 1023 records
      2017-09-19 04:22:56,268 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record batch with status OK
      2017-09-19 04:22:56,275 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 145 us to sort 1023 records
      2017-09-19 04:22:56,354 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record batch with status OK
      2017-09-19 04:22:56,357 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 143 us to sort 1023 records
      2017-09-19 04:22:56,361 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG o.a.d.exec.compile.ClassTransformer - Compiled and merged PriorityQueueCopierGen50: bytecode size = 11.0 KiB, time = 124 ms.
      2017-09-19 04:22:56,365 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 108 us to sort 1023 records
      2017-09-19 04:22:56,367 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG o.a.d.e.p.i.x.m.PriorityQueueCopierWrapper - Copier setup complete
      2017-09-19 04:22:56,375 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 144 us to sort 1023 records
      2017-09-19 04:22:56,396 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG o.a.drill.exec.vector.BigIntVector - Reallocating vector [$data$(BIGINT:REQUIRED)]. # of bytes: [0] -> [0]
      2017-09-19 04:22:56,396 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG o.a.drill.exec.vector.BigIntVector - Reallocating vector [$data$(BIGINT:REQUIRED)]. # of bytes: [0] -> [0]
      2017-09-19 04:22:56,396 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG o.a.drill.exec.vector.BigIntVector - Reallocating vector [$data$(BIGINT:REQUIRED)]. # of bytes: [0] -> [0]
      

      From here on out, the drillbit.log file is filled with "Reallocating vector" messages. The log seems to be written as fast as Drill can write to disk. The logs are filled within a minute.

      sing cln2 q40.q

      Attachments

        1. drillbit.log
          12.30 MB
          Robert Hou

        Issue Links

          Activity

            People

              Paul.Rogers Paul Rogers
              rhou Robert Hou
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: