IMPALA-9183

TPC-DS query 13 - customer_address predicates not propagated to scan


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: Impala 3.4.0
    • Fix Version: Impala 4.0.0
    • Component: Frontend

    Description

      TPC-DS query 13 has a set of predicates on the ca_state column of the customer_address table that are currently evaluated only after the join of customer_address and store_sales. These predicates could be pushed down to the customer_address scan node, which would reduce the size of the join input by a factor of 3.4.

      As an experiment, I added a redundant predicate to the query (see the attached query13_mod.sql), which causes the planner to evaluate the predicate at the scan node.
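
      To illustrate why a redundant predicate of this kind is safe, here is a minimal sketch using Python's built-in sqlite3 module. The table contents, state values, and profit ranges are hypothetical stand-ins, and the exact predicate in query13_mod.sql is not reproduced here. The sketch assumes the redundant predicate is the union of the states named in the disjuncts; since the disjunction implies it, the result set should not change, while a planner is free to apply it at the customer_address scan.

```python
import sqlite3

# Hypothetical, heavily simplified stand-ins for the TPC-DS tables; the real
# query13.sql and query13_mod.sql attachments are not reproduced here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_address (ca_address_sk INTEGER, ca_state TEXT);
    CREATE TABLE store_sales (ss_addr_sk INTEGER, ss_net_profit REAL);
    INSERT INTO customer_address VALUES
        (1, 'TX'), (2, 'OH'), (3, 'OR'), (4, 'CA'), (5, 'NY');
    INSERT INTO store_sales VALUES
        (1, 150.0), (2, 50.0), (3, 200.0), (4, 175.0), (5, 120.0);
""")

# Shape of the original filter: each ca_state test is tied to a different
# ss_net_profit range inside an OR, so it references both tables.
DISJUNCTION = """
    (   (ca_state IN ('TX', 'OH') AND ss_net_profit BETWEEN 100 AND 200)
     OR (ca_state IN ('OR', 'NM') AND ss_net_profit BETWEEN 150 AND 300))
"""

original = f"""
    SELECT ca_address_sk, ca_state, ss_net_profit
    FROM store_sales JOIN customer_address ON ss_addr_sk = ca_address_sk
    WHERE {DISJUNCTION}
    ORDER BY ca_address_sk
"""

# Assumed form of the redundant single-table predicate: the union of the
# states named in the disjuncts. It is implied by the disjunction.
modified = f"""
    SELECT ca_address_sk, ca_state, ss_net_profit
    FROM store_sales JOIN customer_address ON ss_addr_sk = ca_address_sk
    WHERE ca_state IN ('TX', 'OH', 'OR', 'NM')
      AND {DISJUNCTION}
    ORDER BY ca_address_sk
"""

original_rows = conn.execute(original).fetchall()
modified_rows = conn.execute(modified).fetchall()

# Rows that survive the scan-level filter alone, i.e. the reduced join input.
scan_rows = conn.execute(
    "SELECT COUNT(*) FROM customer_address "
    "WHERE ca_state IN ('TX', 'OH', 'OR', 'NM')"
).fetchone()[0]

print(original_rows == modified_rows, scan_rows)
```

      In this toy data set the scan-level filter keeps only part of customer_address before the join, which mirrors the join input reduction described above. SQLite's planner is not Impala's, so the sketch demonstrates result equivalence only, not the plan change.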

      Performance of the original and modified queries at 10 TB scale factor:

      Original:  164 seconds

      Modified: 44 seconds

      Query profiles for both versions attached.

      Attachments

        1. q13_plan.png
          163 kB
          David Rorke
        2. q13_mod_plan.png
          166 kB
          David Rorke
        3. profile_q13.txt
          636 kB
          David Rorke
        4. profile_q13_mod.txt
          629 kB
          David Rorke
        5. query13.sql
          2 kB
          David Rorke
        6. query13_mod.sql
          2 kB
          David Rorke

            People

              Assignee: Aman Sinha (amansinha)
              Reporter: David Rorke (drorke)
              Votes: 0
              Watchers: 6
