IMPALA-10325

Parquet scan should use min/max statistics to skip pages based on equi-join predicate


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 4.0.0
    • Component/s: Backend
    • Labels: None

    Description

      Parquet stores min/max statistics for each page. The scanner can use them to skip pages whose value range cannot satisfy an equi-join predicate.

      The query below ends up scanning all rows of table a. That may be unnecessary if the min/max of b.ss_addr_sk can be determined and applied during the scan of a.

      select a.ss_sold_time_sk
      from store_sales a join [SHUFFLE] store_sales b
      where a.ss_addr_sk = b.ss_addr_sk
        and b.ss_customer_sk < 10;
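
      The idea can be illustrated with a minimal Python sketch. It is not Impala's backend code: the Page class and both helper functions are hypothetical names introduced only for illustration. It assumes the min/max of the build-side join keys (here, the b.ss_addr_sk values that survive b.ss_customer_sk < 10) is known before the probe-side pages of a.ss_addr_sk are read, and it keeps only pages whose [min, max] range overlaps that interval.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Page:
    """Hypothetical model of one Parquet data page of the probe-side join column."""
    min_val: int       # page-level min statistic
    max_val: int       # page-level max statistic
    values: List[int]  # decoded column values (unused by the pruning check)


def build_side_range(keys: Iterable[int]) -> Optional[Tuple[int, int]]:
    """Return (min, max) of the build-side join keys, or None if there are none."""
    keys = list(keys)
    if not keys:
        return None
    return min(keys), max(keys)


def pages_to_scan(pages: List[Page], key_range: Optional[Tuple[int, int]]) -> List[Page]:
    """Keep only pages whose [min_val, max_val] overlaps the build-side key range.

    Assumption: the join is an inner equi-join, so an empty build side means
    no probe row can match and no page needs to be read.
    """
    if key_range is None:
        return []
    lo, hi = key_range
    return [p for p in pages if p.max_val >= lo and p.min_val <= hi]


# Three pages of the probe-side column with disjoint value ranges.
pages = [
    Page(1, 100, [1, 50, 100]),
    Page(101, 200, [101, 150, 200]),
    Page(201, 300, [201, 250, 300]),
]

# Build-side keys remaining after the build-side filter (made-up values).
key_range = build_side_range([120, 150, 180])
selected = pages_to_scan(pages, key_range)
print([(p.min_val, p.max_val) for p in selected])  # [(101, 200)]
```

      The overlap test is conservative: a page that is kept may still contain no matching row, so the join itself must still evaluate the predicate. Only pages that provably cannot match are skipped.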
      


            People

              sql_forever Qifan Chen