IMPALA-6258

Uninitialized tuple pointers in row batch for empty rows

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 2.11.0
    • Fix Version/s: Impala 2.12.0
    • Component/s: Backend
    • Labels:

      Description

      During code review of IMPALA-6187, it was noticed that the tuple pointers in the generated row batches may not be initialized if a tuple has byte size 0. It is unclear whether there are edge cases in which the code dereferences these uninitialized tuple pointers. In addition, some code compares these pointers against NULL, so leaving them uninitialized may produce wrong (and non-deterministic) results:

      BooleanVal TupleIsNullPredicate::GetBooleanVal(
          ScalarExprEvaluator* evaluator, const TupleRow* row) const {
        int count = 0;
        for (int i = 0; i < tuple_idxs_.size(); ++i) {
          count += row->GetTuple(tuple_idxs_[i]) == NULL;
        }
        // Return true only if all originally specified tuples are NULL. Return false if any
        // tuple is non-nullable.
        return BooleanVal(count == tuple_ids_.size());
      }
      

      Tim Armstrong came up with the following example:

        SELECT /* +straight_join */ COUNT(t1.id)
        FROM functional.alltypessmall t1
        LEFT OUTER JOIN (
          SELECT /* +straight_join */ IFNULL(t2.int_col, 1) AS c
          FROM functional.alltypessmall t2
          LEFT OUTER JOIN functional.alltypestiny t3 ON t2.id < 1000
        ) v ON t1.int_col = v.c;
      The relevant part of the plan is:
          | 04:HASH JOIN [LEFT OUTER JOIN, PARTITIONED]                                         |
          | |  hash predicates: t1.int_col = if(TupleIsNull(1, 2), NULL, ifnull(t2.int_col, 1)) |
          | |  fk/pk conjuncts: assumed fk/pk                                                   |
          | |  mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB                  |
          | |  tuple-ids=0,1N,2N row-size=16B cardinality=100                                   |
          | |                                                                                   |
          | |--08:EXCHANGE [HASH(if(TupleIsNull(1, 2), NULL, ifnull(t2.int_col, 1)))]           |
          | |  |  mem-estimate=0B mem-reservation=0B                                            |
          | |  |  tuple-ids=1,2N row-size=8B cardinality=100                                    |
          | |  |                                                                                |
          | |  F01:PLAN FRAGMENT [RANDOM] hosts=3 instances=3                                   |
          | |  Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B                      |
          | |  03:NESTED LOOP JOIN [LEFT OUTER JOIN, BROADCAST]                                 |
          | |  |  join predicates: t2.id < 1000                                                 |
          | |  |  mem-estimate=0B mem-reservation=0B                                            |
          | |  |  tuple-ids=1,2N row-size=8B cardinality=100                                    |
          | |  |                                                                                |
          | |  |--06:EXCHANGE [BROADCAST]                                                       |
          | |  |  |  mem-estimate=0B mem-reservation=0B                                         |
          | |  |  |  tuple-ids=2 row-size=0B cardinality=8                                      |
          | |  |  |                                                                             |
          | |  |  F02:PLAN FRAGMENT [RANDOM] hosts=3 instances=3                                |
          | |  |  Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B                   |
          | |  |  02:SCAN HDFS [functional.alltypestiny t3, RANDOM]                             |
          | |  |     partitions=4/4 files=4 size=460B                                           |
          | |  |     stats-rows=8 extrapolated-rows=disabled                                    |
          | |  |     table stats: rows=8 size=unavailable                                       |
          | |  |     column stats: all                                                          |
          | |  |     mem-estimate=32.00MB mem-reservation=0B                                    |
          | |  |     tuple-ids=2 row-size=0B cardinality=8                                      |
          | |  |                                                                                |
          | |  01:SCAN HDFS [functional.alltypessmall t2, RANDOM]                               |
          | |     partitions=4/4 files=4 size=6.32KB                                            |
          | |     stats-rows=100 extrapolated-rows=disabled                                     |
          | |     table stats: rows=100 size=unavailable                                        |
          | |     column stats: all                                                             |
          | |     mem-estimate=32.00MB mem-reservation=0B                                       |
          | |     tuple-ids=1 row-size=8B cardinality=100                                       |
           
      

      We should fix this by pointing these empty tuples at a dummy non-NULL pointer.
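
A minimal sketch of that idea, not the actual Impala patch. Every name below (`EmptyTupleSentinel`, `TuplePtrForSlot`, `AllTuplesNull`, and the stand-in `Tuple` struct) is hypothetical and exists only for illustration. It assumes a zero-byte tuple can be represented by one shared non-NULL address that is never dereferenced, so that a NULL check in the style of TupleIsNullPredicate gives a deterministic answer:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tuple; it is never dereferenced in this sketch.
struct Tuple {};

// Hypothetical shared sentinel: a valid, non-NULL address used for every
// zero-byte tuple instead of leaving the row's pointer uninitialized.
inline Tuple* EmptyTupleSentinel() {
  static uint8_t storage = 0;
  return reinterpret_cast<Tuple*>(&storage);
}

// Hypothetical helper that picks the pointer to store in a row slot.
// 'is_null' models an outer join that found no match for this tuple.
inline Tuple* TuplePtrForSlot(size_t tuple_byte_size, uint8_t* buffer, bool is_null) {
  if (is_null) return nullptr;                              // genuinely NULL tuple
  if (tuple_byte_size == 0) return EmptyTupleSentinel();    // empty but present
  return reinterpret_cast<Tuple*>(buffer);                  // regular tuple data
}

// Same counting pattern as the NULL check quoted in the description,
// applied to a plain vector of tuple pointers.
inline bool AllTuplesNull(const std::vector<Tuple*>& row) {
  size_t count = 0;
  for (Tuple* t : row) count += (t == nullptr);
  return count == row.size();
}
```

With this convention, a matched row containing a zero-byte tuple always carries a non-NULL pointer for it, so only rows that were really NULL-extended by the outer join are counted as NULL.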

      Alex came up with these queries, which currently produce non-deterministic results:

      select count(v.x) from functional.alltypestiny t3 left outer join (select true as x from functional.alltypestiny t1 left outer join functional.alltypestiny t2 on (true)) v on (v.x = t3.bool_col) where t3.bool_col = true;
      
      select count(v.x) from functional_kudu.alltypestiny t3 left outer join (select true as x from functional_kudu.alltypestiny t1 left outer join functional_kudu.alltypestiny t2 on (true)) v on (v.x = t3.bool_col) where t3.bool_col = true;
      

            People

            • Assignee:
              boroknagyz Zoltán Borók-Nagy
              Reporter:
              kwho Michael Ho