IMPALA-3311

String data coming out of agg can be corrupted by blocking operators

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version: Impala 2.5.0
    • Fix Version: Impala 2.6.0
    • Component: Backend

    Description

      Varlen data (e.g. strings) produced by aggregations is freed after the output has been passed up to the parent operator (https://github.com/cloudera/Impala/blob/cdh5-trunk/be/src/exec/partitioned-aggregation-node.cc#L1353). This works for streaming operators and for blocking operators that copy their input, but it results in memory corruption when the output reaches a blocking operator that does not copy its input, because that operator still references the freed data.
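      The lifetime hazard described above can be illustrated with a small standalone sketch. This is not Impala code: `ReusedBuffer`, `Consumer`, and `Consume` are hypothetical names invented for illustration, and "freeing" is simulated by overwriting the buffer contents so that the sketch stays well-defined instead of reading freed memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the aggregation's varlen buffer. The storage stays
// allocated for the whole run; Release() only scribbles over its contents.
class ReusedBuffer {
 public:
  explicit ReusedBuffer(std::size_t capacity) : bytes_(capacity, '\0') {}

  // Copies `value` into the buffer and returns a view into that storage.
  std::string_view Store(const std::string& value) {
    assert(used_ + value.size() <= bytes_.size());
    char* dst = bytes_.data() + used_;
    std::memcpy(dst, value.data(), value.size());
    used_ += value.size();
    return std::string_view(dst, value.size());
  }

  // Simulates the free: any view handed out earlier now sees '#' bytes.
  void Release() {
    std::fill(bytes_.begin(), bytes_.end(), '#');
    used_ = 0;
  }

 private:
  std::vector<char> bytes_;
  std::size_t used_ = 0;
};

enum class Consumer { kStreaming, kBlockingCopy, kBlockingNoCopy };

// Feeds each aggregation result to a consumer, releasing the producer's
// buffer after every row, and returns what the consumer finally emits.
std::vector<std::string> Consume(const std::vector<std::string>& agg_results,
                                 Consumer kind) {
  ReusedBuffer buffer(64);
  std::vector<std::string> output;
  std::vector<std::string> copied;         // blocking, owns its own bytes
  std::vector<std::string_view> borrowed;  // blocking, keeps pointers only

  for (const std::string& result : agg_results) {
    std::string_view row = buffer.Store(result);  // producer passes a row up
    switch (kind) {
      case Consumer::kStreaming:  // row is consumed before the release
        output.push_back(std::string(row));
        break;
      case Consumer::kBlockingCopy:
        copied.push_back(std::string(row));
        break;
      case Consumer::kBlockingNoCopy:
        borrowed.push_back(row);
        break;
    }
    buffer.Release();  // producer frees varlen data after passing it up
  }

  // Blocking consumers emit only after all input has been seen.
  for (const std::string& row : copied) output.push_back(row);
  for (std::string_view row : borrowed) output.push_back(std::string(row));
  return output;
}
```

      With the input {"10", "abc"}, the streaming and copying consumers return the input unchanged, while the non-copying blocking consumer returns {"##", "###"}: its stored views outlived the data they pointed to. In a real process the memory would actually be freed, which is the situation ASAN reports.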

      Repro
      Build with ASAN, start an impalad with the -disable_mem_pools flag, and run the following query:

      select id, m from functional_parquet.complextypestbl t,
      (select min(cast(item as string)) m from t.int_array) v
      

      I've attached the ASAN output from running this query (asan_output.txt).

      Symptoms
      If the query plan contains an aggregation node that produces string values anywhere within a subplan (i.e. the SQL statement applies an aggregate function inside an inline view over a collection column), the results of the aggregation may be incorrect.

      Attachments

        1. asan_output.txt
          17 kB
          Skye Wanderman-Milne

          People

            Assignee: Skye Wanderman-Milne (skye)
            Reporter: Skye Wanderman-Milne (skye)
            Votes: 2
            Watchers: 5
