Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-2783

Push down filters on rank similar to limit

    XMLWordPrintableJSON

Details

    Description

      Similar to limit push down optimization we should extend the rule to cover filters on Rank(), dense_rank() etc... as users tend to have explicit filters on RANK()

      Query

      select *
      FROM   (SELECT Rank()
      OVER(
      ORDER BY  l_orderkey) AS rank
      FROM   lineitem
      WHERE  l_shipdate < '1992-05-09') a
      WHERE  rank < 10
      

      Plan

      +--------------------------------------------------------------+
      | Explain String                                               |
      +--------------------------------------------------------------+
      | Estimated Per-Host Requirements: Memory=512.00MB VCores=1    |
      |                                                              |
      | 03:SELECT                                                    |
      | |  predicates: rank() < 10                                   |
      | |  hosts=9 per-host-mem=unavailable                          |
      | |  tuple-ids=6,5 row-size=50B cardinality=17999891           |
      | |                                                            |
      | 02:ANALYTIC                                                  |
      | |  functions: rank()                                         |
      | |  order by: l_orderkey ASC                                  |
      | |  window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW |
      | |  hosts=9 per-host-mem=unavailable                          |
      | |  tuple-ids=6,5 row-size=50B cardinality=179998909          |
      | |                                                            |
      | 04:MERGING-EXCHANGE [UNPARTITIONED]                          |
      | |  order by: l_orderkey ASC                                  |
      | |  hosts=9 per-host-mem=unavailable                          |
      | |  tuple-ids=6 row-size=38B cardinality=179998909            |
      | |                                                            |
      | 01:SORT                                                      |
      | |  order by: l_orderkey ASC                                  |
      | |  hosts=9 per-host-mem=336.00MB                             |
      | |  tuple-ids=6 row-size=38B cardinality=179998909            |
      | |                                                            |
      | 00:SCAN HDFS [tpch_300_parquet.lineitem, RANDOM]             |
      |    partitions=1/1 files=264 size=64.36GB                     |
      |    predicates: l_shipdate < '1992-05-09'                     |
      |    table stats: 1799989091 rows total                        |
      |    column stats: all                                         |
      |    hosts=9 per-host-mem=176.00MB                             |
      |    tuple-ids=0 row-size=38B cardinality=179998909            |
      +--------------------------------------------------------------+
      

      Attachments

        Issue Links

          Activity

            People

              amansinha Aman Sinha
              mmokhtar Mostafa Mokhtar
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: