Apache Drill / DRILL-6530

JVM crash on a query over multiple JSON files when one file changes a column's type from string to list


Details

    Description

      The JVM crashes on a LATERAL UNNEST query over multiple JSON files when one of the files changes the type of one column from string to list.

      Query:

      SELECT customer.c_custkey, customer.c_acctbal, orders.o_orderkey, orders.o_totalprice,
             orders.o_orderdate, orders.o_shippriority, customer.c_address, orders.o_orderpriority,
             customer.c_comment
      FROM customer, LATERAL
        (SELECT O.ord.o_orderkey AS o_orderkey, O.ord.o_totalprice AS o_totalprice,
                O.ord.o_orderdate AS o_orderdate, O.ord.o_shippriority AS o_shippriority,
                O.ord.o_orderpriority AS o_orderpriority
         FROM UNNEST(customer.c_orders) O(ord)) orders;
      

      The error observed was:

      o.a.d.e.p.impl.join.LateralJoinBatch - Output batch still has some space left, getting new batches from left and right
      2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_custkey
      2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_phone
      2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_acctbal
      2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_orders
      2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_mktsegment
      2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_address
      2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_nationkey
      2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_name
      2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_comment
      2018-06-21 15:25:16,316 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG o.a.d.e.v.c.AbstractContainerVector - Field [o_comment] mutated from [NullableVarCharVector] to [RepeatedVarCharVector]
      2018-06-21 15:25:16,318 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG o.a.drill.exec.vector.UInt4Vector - Reallocating vector [[`$offsets$` (UINT4:REQUIRED)]]. # of bytes: [16384] -> [32768]
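
      Since the attached drillbit.log is large, a small script can help locate the vector type mutations in it. The following is a minimal sketch, assuming the message format shown in the AbstractContainerVector line above; the function name find_mutations and the abbreviated sample lines are illustrative only and are not part of Drill.

```python
import re

# Matches the AbstractContainerVector message format seen in the log excerpt.
MUTATION = re.compile(
    r"Field \[(?P<field>[^\]]+)\] mutated from \[(?P<old>[^\]]+)\] to \[(?P<new>[^\]]+)\]"
)


def find_mutations(lines):
    """Return (field, old_vector, new_vector) for every type-mutation log line."""
    found = []
    for line in lines:
        match = MUTATION.search(line)
        if match:
            found.append((match["field"], match["old"], match["new"]))
    return found


# Sample lines modelled on the excerpt; the thread id is abbreviated to "[...]".
sample = [
    "2018-06-21 15:25:16,303 [...] DEBUG o.a.d.exec.physical.impl.ScanBatch"
    " - set record count 0 for vv c_comment",
    "2018-06-21 15:25:16,316 [...] DEBUG o.a.d.e.v.c.AbstractContainerVector"
    " - Field [o_comment] mutated from [NullableVarCharVector] to [RepeatedVarCharVector]",
]
print(find_mutations(sample))
```

      To scan the real log, an open file handle can be passed in place of the sample list, for example find_mutations(open("drillbit.log")).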
      

      On further investigation with shamirwasia, we found that the crash happens only when [o_comment] mutates from [NullableVarCharVector] to [RepeatedVarCharVector], not the other way around.
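
      To illustrate the kind of schema change involved, the sketch below writes two tiny JSON files in which o_comment is a string in the first and a list of strings in the second. This is not the attached data: the file names, the values, and the placement of o_comment inside c_orders are assumptions made for illustration, and it has not been verified that data this small triggers the crash.

```python
import json
import os
import tempfile


def write_repro_files(directory):
    """Write two small JSON files whose o_comment type differs (string vs. list)."""
    # Hypothetical records: field names follow the query, values are invented,
    # and nesting o_comment under c_orders is an assumption.
    string_version = {
        "c_custkey": 1,
        "c_orders": [{"o_orderkey": 10, "o_comment": "plain text"}],
    }
    list_version = {
        "c_custkey": 2,
        "c_orders": [{"o_orderkey": 20, "o_comment": ["now", "a", "list"]}],
    }
    paths = []
    for name, record in (("a_string.json", string_version),
                         ("b_list.json", list_version)):
        path = os.path.join(directory, name)
        with open(path, "w") as handle:
            # One JSON object per line.
            handle.write(json.dumps(record) + "\n")
        paths.append(path)
    return paths


paths = write_repro_files(tempfile.mkdtemp())
print(paths)
```

      The files are named so that the string version sorts first, matching the string-to-list direction reported to crash; whether the files are actually read in that order is also an assumption.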

      Please find the logs, stack trace, and data files attached.

       

      Attachments

        1. drillbit.out
          7 kB
          Kedar Sankar Behera
        2. hs_err_pid32076.log
          405 kB
          Kedar Sankar Behera
        3. 0_0_92.json
          4.91 MB
          Kedar Sankar Behera
        4. 0_0_93.json
          5.10 MB
          Kedar Sankar Behera
        5. drillbit.log
          47.62 MB
          Kedar Sankar Behera

            People

              shamirwasia Sorabh Hamirwasia
              kedar.behera Kedar Sankar Behera
              Parth Chandra Parth Chandra
              Votes: 0
              Watchers: 4
