IMPALA-10229

Analytic limit pushdown optimization can be applied incorrectly based on predicates present

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 4.0.0
    • Component/s: Frontend

    Description

      [localhost.EXAMPLE.COM:21050] default> select * from (select month, id, rank() over (partition by month order by id desc) rnk from functional_parquet.alltypes WHERE month >= 11) v order by month, id limit 3;
      +-------+------+-----+
      | month | id   | rnk |
      +-------+------+-----+
      | 11    | 6987 | 3   |
      | 11    | 6988 | 2   |
      | 11    | 6989 | 1   |
      +-------+------+-----+
      Fetched 3 row(s) in 4.16s
      

      These are not the first 3 rows when ordering by month, id. Hive's result is correct:

      +----------+-------+--------+
      | v.month  | v.id  | v.rnk  |
      +----------+-------+--------+
      | 11       | 3040  | 600    |
      | 11       | 3041  | 599    |
      | 11       | 3042  | 598    |
      +----------+-------+--------+
      

      I think that when there are no select predicates, the ordering in the analytic sort needs to exactly match the TOP N sort ordering. I'm not sure whether fixes are also needed for the case where there are select predicates.
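
      To illustrate why the orderings have to match, here is a toy model in Python. It is not Impala's planner code: the row data, the function names, and the assumption that the bad plan applies the limit in the analytic sort order (month asc, id desc) are all made up for illustration, chosen only because they reproduce the shape of the wrong result above.

```python
from itertools import groupby


def asc_key(row):
    """Sort key for ORDER BY month, id."""
    return (row[0], row[1])


def desc_key(row):
    """Sort key for ORDER BY month, id DESC (the analytic sort order)."""
    return (row[0], -row[1])


def rank_desc(rows):
    """rank() over (partition by month order by id desc) on (month, id) rows.

    ids are unique in this toy data, so rank equals position in the partition.
    """
    out = []
    for _, part in groupby(sorted(rows, key=desc_key), key=lambda r: r[0]):
        for pos, (month, id_) in enumerate(part, start=1):
            out.append((month, id_, pos))
    return out


def correct_plan(rows, outer_key, limit):
    """Rank every row first, then apply the outer ORDER BY ... LIMIT."""
    return sorted(rank_desc(rows), key=outer_key)[:limit]


def pushed_down_plan(rows, outer_key, limit):
    """Hypothetical bad plan: limit applied in analytic sort order first."""
    kept = sorted(rows, key=desc_key)[:limit]
    return sorted(rank_desc(kept), key=outer_key)[:limit]


# Hypothetical data: month 11 has ids 1..10, month 12 has ids 11..15.
rows = [(11, i) for i in range(1, 11)] + [(12, i) for i in range(11, 16)]

correct = correct_plan(rows, asc_key, 3)
pushed = pushed_down_plan(rows, asc_key, 3)
print(correct)  # [(11, 1, 10), (11, 2, 9), (11, 3, 8)]
print(pushed)   # [(11, 8, 3), (11, 9, 2), (11, 10, 1)]
```

      In this model the two plans agree only when the outer sort key is desc_key, i.e. when the TOP N ordering is the same as the analytic sort ordering.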

          People

            tarmstrong Tim Armstrong
            tarmstrong Tim Armstrong