IMPALA-3332

Sorter does not do query maintenance correctly

Details

    Description

      I ran this query on tpch_parquet:

      select *
      from (
        select l_orderkey, l_returnflag, l_linestatus, '' from lineitem
        union all
        select l_orderkey, l_returnflag, l_linestatus, '' from lineitem
        union all
        select l_orderkey, l_returnflag, l_linestatus, '' from lineitem
        union all
        select l_orderkey, l_returnflag, l_linestatus, '' from lineitem
      ) x
      order by l_orderkey
      

      Even after the query had been running for 10+ seconds, I didn't see any updates in the execution summary.

      I then ran a slightly different query, which had runaway memory usage. I presume this is because the sorter didn't free local allocations:

      select *
      from (
        select *, first_value(col) over (order by sort_col) fv
        from (
          select concat(l_linestatus, repeat('a', 2047)) sort_col,
                 substr(l_returnflag, 1, 0) col
          from lineitem
        ) q
      ) q2
      where fv != ''
      

      All of the allocations came from this call stack:

      Sorter::TupleSorter::Partition
      codegened function
      impala_udf::StringVal
      FunctionContextImpl::AllocateLocal
      

          People

            kwho Michael Ho
            tarmstrong Tim Armstrong