  1. Apache Drill
  2. DRILL-1212

Remove extra exchange operator when generating a two-phase aggregation plan.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: None
    • Labels: None

    Description

      We may see an extra hash exchange operator when the optimizer chooses a two-phase aggregation plan for a query with group by GB_Key order by OB_Key, where GB_Key is the same as OB_Key.

      One example of such a plan with an extra exchange is TPC-H Q12:

      select
        l.l_shipmode,
        sum(case
              when o.o_orderpriority = '1-URGENT'
                or o.o_orderpriority = '2-HIGH'
              then 1
              else 0
            end) as high_line_count,
        sum(case
              when o.o_orderpriority <> '1-URGENT'
                and o.o_orderpriority <> '2-HIGH'
              then 1
              else 0
            end) as low_line_count
      from
        cp.`tpch/orders.parquet` o,
        cp.`tpch/lineitem.parquet` l
      where
        o.o_orderkey = l.l_orderkey
        and l.l_shipmode in ('TRUCK', 'REG AIR')
        and l.l_commitdate < l.l_receiptdate
        and l.l_shipdate < l.l_commitdate
        and l.l_receiptdate >= date '1994-01-01'
        and l.l_receiptdate < date '1994-01-01' + interval '1' year
      group by
        l.l_shipmode
      order by
        l.l_shipmode;

      The issue arises because the rule does not propagate the hash distribution upward, so the planner introduces an unnecessary hash exchange.
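
      The effect can be illustrated with a toy model in Python. This is only a sketch and not Drill's planner code: the names Op, require_hash, two_phase_agg, plan and count_exchanges are all hypothetical, and it assumes that the operator above the aggregation (here a sort on the group-by key) asks its input to be hash-distributed on that same key. When the second aggregation phase does not report the distribution it inherited from the exchange below it, the parent cannot tell that its requirement is already met and adds a second exchange.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Op:
    """A node in a linear toy plan (hypothetical, not a Drill class)."""
    name: str
    child: Optional["Op"] = None
    # Keys the output is hash-distributed on; None means "unknown".
    hash_keys: Optional[Tuple[str, ...]] = None


def require_hash(child: Op, keys: Tuple[str, ...]) -> Op:
    """Return child as-is if it is already hash-distributed on keys,
    otherwise wrap it in a HashExchange on those keys."""
    if child.hash_keys == keys:
        return child
    return Op("HashExchange", child, keys)


def two_phase_agg(scan: Op, keys: Tuple[str, ...], propagate: bool) -> Op:
    # Phase 1: local partial aggregation; its distribution is unknown.
    phase1 = Op("HashAgg(phase 1)", scan)
    # Phase 2 needs rows with equal keys on the same node.
    exchange = require_hash(phase1, keys)
    # Phase 2 keeps the distribution of its input. The bug is modelled by
    # propagate=False, where that fact is not reported to the parent.
    reported = exchange.hash_keys if propagate else None
    return Op("HashAgg(phase 2)", exchange, reported)


def plan(keys: Tuple[str, ...], propagate: bool) -> Op:
    agg = two_phase_agg(Op("Scan"), keys, propagate)
    # Assumption: the sort asks for hash distribution on the same keys.
    return Op("Sort", require_hash(agg, keys), keys)


def count_exchanges(op: Optional[Op]) -> int:
    """Count HashExchange nodes from the root down to the leaf."""
    n = 0
    while op is not None:
        n += int(op.name == "HashExchange")
        op = op.child
    return n


if __name__ == "__main__":
    keys = ("l_shipmode",)
    print("without propagation:", count_exchanges(plan(keys, propagate=False)))
    print("with propagation:   ", count_exchanges(plan(keys, propagate=True)))
```

      In this model the only difference between the two plans is whether the second aggregation phase reports the hash distribution of its input; the rest of the planning logic is unchanged.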

          People

            Assignee: Unassigned
            Reporter: Jinfeng Ni (jni)