IMPALA-1386

Bad plan generated for TPC-DS query 1

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.0
    • Fix Version/s: Impala 2.0.1
    • Component/s: None
    • Labels: None

    Description

      I added a partition filter (a predicate on sr_returned_date_sk) to TPC-DS query 1 to limit the scan of the store_returns table. The generated plan has a couple of problems:
      (1) The store_returns table is scanned twice (self join in the query)
      (2) One of the two scans ignores the partition filter

      Query:

      explain with customer_total_return as
      (select sr_customer_sk as ctr_customer_sk
        ,sr_store_sk as ctr_store_sk
        ,sum(SR_RETURN_TAX) as ctr_total_return
      from store_returns
        ,date_dim
      where sr_returned_date_sk = d_date_sk
        and d_year = 2002
        and sr_returned_date_sk between 2452276 and 2452640
      group by sr_customer_sk
        ,sr_store_sk)
      select * from (select c_customer_id
      from customer_total_return ctr1
        ,store
        ,customer
      where ctr1.ctr_total_return > (select avg(ctr_total_return) * 1.2
          from customer_total_return ctr2
          where ctr1.ctr_store_sk = ctr2.ctr_store_sk)
        and s_store_sk = ctr1.ctr_store_sk
        and s_state = 'LA'
        and ctr1.ctr_customer_sk = c_customer_sk
      order by c_customer_id
      ) as big_return_customers limit 100;
      

      Plan is attached.
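
      One possible way to narrow this down, sketched here as a suggestion rather than something taken from the report, is to run EXPLAIN on the body of the WITH clause by itself. If the store_returns scan in this reduced plan honors the predicate on sr_returned_date_sk, the lost filter is more likely tied to how the second reference to customer_total_return (the correlated subquery over ctr2) is planned. This assumes the scan node in the EXPLAIN output reports how many partitions it reads, so the reduced plan can be compared against the two scans in the full plan.

```sql
-- Reduced query: only the body of the customer_total_return WITH clause.
-- Assumption: same schema and partitioning as in the report
-- (store_returns partitioned on sr_returned_date_sk).
explain
select sr_customer_sk as ctr_customer_sk
  ,sr_store_sk as ctr_store_sk
  ,sum(sr_return_tax) as ctr_total_return
from store_returns
  ,date_dim
where sr_returned_date_sk = d_date_sk
  and d_year = 2002
  -- The partition filter added in the report; the scan of store_returns
  -- should be limited to partitions in this range.
  and sr_returned_date_sk between 2452276 and 2452640
group by sr_customer_sk
  ,sr_store_sk;
```

      If the reduced plan prunes partitions as expected, the partition counts of the two store_returns scans in the full plan should both match it; a scan that reads every partition is the one ignoring the filter.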

      Attachments

        1. q1.profile
          643 kB
          David Rorke
        2. q1.plan
          13 kB
          David Rorke

          People

            dtsirogiannis Dimitris Tsirogiannis
            drorke David Rorke
            Votes: 0
            Watchers: 4
