IMPALA-11280

Zipping unnest hits DCHECK when querying from a view that has an IN operator

Details

    Description

      Repro steps:

      1) Create a view that returns arrays and has an IN operator in the WHERE clause:

      drop view if exists unnest_bug_view;
      create view unnest_bug_view as (
        select id, arr1, arr2
        from functional_parquet.complextypes_arrays
        where id % 2 = 1 and id in (select id from functional_parquet.alltypestiny)
      ); 

      2) Unnest the arrays and filter by the unnested values in an outer SELECT:

      select
        id,
        unnested_arr1,
        unnested_arr2
      from
        (select
           id,
           unnest(arr1) as unnested_arr1,
           unnest(arr2) as unnested_arr2
         from unnest_bug_view) a
      where a.unnested_arr1 < 5; 

      This hits a DCHECK in RowDescriptor::GetTupleIdx(): tuple id 3 is looked up in a row descriptor whose tuple_idx_map_ has only 2 entries.

      descriptors.cc:467] 5643fd6cdd5cece3:77942ead00000000] Check failed: id < tuple_idx_map_.size() (3 vs. 2) RowDescriptor: Tuple(id=0 size=29 slots=[Slot(id=2 type=INT col_path=[0] offset=24 null=(offset=28 mask=4) slot_idx=2 field_idx=2), Slot(id=3 type=ARRAY col_path=[1] children_tuple_id=3 offset=0 null=(offset=28 mask=1) slot_idx=0 field_idx=0), Slot(id=5 type=ARRAY col_path=[2] children_tuple_id=4 offset=12 null=(offset=28 mask=2) slot_idx=1 field_idx=1)] tuple_path=[])
      Tuple(id=1 size=5 slots=[Slot(id=0 type=INT col_path=[2] offset=0 null=(offset=4 mask=1) slot_idx=0 field_idx=0)] tuple_path=[])
      *** Check failure stack trace: ***
          @          0x36fe72c  google::LogMessage::Fail()
          @          0x36fffdc  google::LogMessage::SendToLog()
          @          0x36fe08a  google::LogMessage::Flush()
          @          0x3701c48  google::LogMessageFatal::~LogMessageFatal()
          @          0x12e47ab  impala::RowDescriptor::GetTupleIdx()
          @          0x1b378f5  impala::SlotRef::Init()
          @          0x1b25fea  impala::ScalarExpr::Init()
          @          0x1b665b2  impala::ScalarFnCall::Init()
          @          0x1b2c44e  impala::ScalarExpr::Create()
          @          0x1b2c5df  impala::ScalarExpr::Create()
          @          0x1b2c6a0  impala::ScalarExpr::Create()
          @          0x19ad286  impala::PartitionedHashJoinPlanNode::Init()
          @          0x18b5d8d  impala::PlanNode::CreateTreeHelper()
          @          0x18b5cd9  impala::PlanNode::CreateTreeHelper()
          @          0x18b5e48  impala::PlanNode::CreateTree()
          @          0x12f4ca7  impala::FragmentState::Init()
          @          0x12f839c  impala::FragmentState::CreateFragmentStateMap()
          @          0x126cedb  impala::QueryState::StartFInstances()
          @          0x125c4df  impala::QueryExecMgr::ExecuteQueryHelper()
      

      Some notes about the repro:

      • The inner SELECT on its own (without the outer filter on the unnested value) runs fine.
      • Unnesting only one of the two arrays runs fine.
      • Removing the IN clause from the view's DDL makes the query run fine.

      Update:

      I managed to reproduce the issue without creating an actual view. This should make the tuple/slot IDs simpler to follow during the investigation.

      select id, unnested_arr1, unnested_arr2 from (
        select id, unnest(arr1) as unnested_arr1, unnest(arr2) as unnested_arr2
        from functional_parquet.complextypes_arrays
        where id in (select id from functional_parquet.alltypestiny)) a
      where a.unnested_arr1 < 5;

          People

            gaborkaszab Gabor Kaszab