IMPALA-4554

Memory corruption of nested collection with MT_DOP > 0.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend
    • Labels:

      Description

      A query that explodes and aggregates a nested collection appears to hang with mt_dop>0.

      The reason is that the in-memory collection slot has a "garbage" size, which leads to infinite unnesting. The memory appears to get corrupted at some point, but I have not yet been able to determine exactly where.

      Start Impala like this:

      bin/start-impala-cluster.py -s 1 --impalad_args="--default_query_options=mt_dop=1"
      

      Query to repro:

      select id, cnt from functional_parquet.complextypestbl t,
      (select count(item) cnt from t.int_array) v;
      

      Relevant code snippet from unnest-node.cc

        Tuple* tuple = containing_subplan_->current_input_row_->GetTuple(coll_tuple_idx_);
        if (tuple != NULL) {
          // Retrieve the collection value to be unnested directly from the tuple. We purposely
          // ignore the null bit of the slot because we may have set it in a previous Open() of
          // this same unnest node for projection.
          coll_value_ = reinterpret_cast<const CollectionValue*>(
              tuple->GetSlot(coll_slot_desc_->tuple_offset()));
          // Projection: Set the slot containing the collection value to NULL.
          tuple->SetNull(coll_slot_desc_->null_indicator_offset());
        } else {
          coll_value_ = &EMPTY_COLLECTION_VALUE;
          DCHECK_EQ(coll_value_->num_tuples, 0);
        }
      
      At this point coll_value_->num_tuples appears to be garbage.
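
To illustrate why a garbage num_tuples is plausible here, the following is a minimal toy model, not Impala code. Because the unnest node reads the slot bytes regardless of the null bit, whatever bytes are in the slot determine how many items get unnested. The names ToyCollectionValue, ToyTuple, OpenUnnest, WriteCollection, WriteNullUnsafe and WriteNullSafe are hypothetical and exist only for this sketch; it does not claim to reproduce the actual root cause.

```cpp
#include <cstdint>
#include <cstring>

// Toy stand-in for a collection value: a data pointer plus an item count.
// (Hypothetical; only the fields this sketch needs.)
struct ToyCollectionValue {
  const uint8_t* ptr;
  int32_t num_tuples;
};

// Toy tuple: one null-indicator byte followed by a single collection slot.
struct ToyTuple {
  uint8_t null_bit;
  ToyCollectionValue coll;
};

// Mimics the Open() logic quoted above: read the slot while ignoring the
// null bit, then set the null bit for projection. Returns the number of
// items the unnest would iterate over.
int32_t OpenUnnest(ToyTuple* tuple) {
  const ToyCollectionValue* coll_value = &tuple->coll;
  tuple->null_bit = 1;
  return coll_value->num_tuples;
}

// Hypothetical producer: writes a valid, non-NULL collection into the slot.
void WriteCollection(ToyTuple* tuple, const uint8_t* data, int32_t n) {
  tuple->null_bit = 0;
  tuple->coll = ToyCollectionValue{data, n};
}

// Hypothetical producer: marks the collection NULL but leaves the slot
// bytes untouched, so a reader that ignores the null bit sees stale data.
void WriteNullUnsafe(ToyTuple* tuple) {
  tuple->null_bit = 1;
}

// Hypothetical producer: marks the collection NULL and also resets the
// slot to an empty collection, which is safe for such a reader.
void WriteNullSafe(ToyTuple* tuple) {
  tuple->null_bit = 1;
  tuple->coll = ToyCollectionValue{nullptr, 0};
}
```

In this model, a second OpenUnnest() on the same tuple still sees the original item count even though the null bit was set by the first call, which is the behavior the code comment describes. The flip side is that any producer leaving stale or uninitialized bytes in the slot hands the unnest an arbitrary count.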
      

        Activity

        alex.behm Alexander Behm added a comment -

        I was also able to get some interesting use-after-free info from ASAN when running test_nested_types.py with MT_DOP=3. I have not been able to pinpoint the problematic query yet.

        kwho Michael Ho added a comment -

        Mind posting the backtrace? Just curious.

        tarmstrong Tim Armstrong added a comment -

        This looks like it's actually the same bug as http://gerrit.cloudera.org:8080/859 - the fix just wasn't ported to the mt scan node.

        tarmstrong Tim Armstrong added a comment -

        IMPALA-4554: fix projection of nested collections with mt_dop > 0

        Change-Id: I42e72eae8dfa78f7d53708eb8f2f0da8b780692d
        Reviewed-on: http://gerrit.cloudera.org:8080/5270
        Tested-by: Internal Jenkins
        Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>

        M be/src/exec/hdfs-scan-node-mt.cc
        M be/src/exec/unnest-node.cc
        M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
        3 files changed, 21 insertions, 0 deletions

        Approvals:
        Internal Jenkins: Verified
        Tim Armstrong: Looks good to me, approved


          People

          • Assignee: Tim Armstrong (tarmstrong)
          • Reporter: Alexander Behm (alex.behm)