Details
Bug
Status: Resolved
Blocker
Resolution: Fixed
Description
[localhost.EXAMPLE.COM:21050] default> select * from (select month, id, rank() over (partition by month order by id desc) rnk from functional_parquet.alltypes WHERE month >= 11) v order by month, id limit 3;
+-------+------+-----+
| month | id   | rnk |
+-------+------+-----+
| 11    | 6987 | 3   |
| 11    | 6988 | 2   |
| 11    | 6989 | 1   |
+-------+------+-----+
Fetched 3 row(s) in 4.16s
These are not the top 3 rows when ordering by month, id. Hive's result is correct:
+----------+-------+--------+
| v.month  | v.id  | v.rnk  |
+----------+-------+--------+
| 11       | 3040  | 600    |
| 11       | 3041  | 599    |
| 11       | 3042  | 598    |
+----------+-------+--------+
I think that when there are no select predicates, the ordering in the analytic sort needs to exactly match the TOP N sort ordering. I'm not sure whether fixes are also needed for the case where there are select predicates.
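To illustrate the suspected mechanism, here is a minimal Python sketch. It is not the planner code. It assumes the limit is being applied in the analytic sort's order (month ascending, id descending) before the outer ORDER BY month, id runs. The function names and the tiny data set are hypothetical and chosen only to reproduce the shape of the two results above.

```python
def rank_desc(rows):
    """Emulate rank() over (partition by month order by id desc).

    Assumes ids are unique within a month, so rank equals row position.
    """
    by_month = {}
    for month, id_ in rows:
        by_month.setdefault(month, []).append(id_)
    ranked = []
    for month, ids in by_month.items():
        for rnk, id_ in enumerate(sorted(ids, reverse=True), start=1):
            ranked.append((month, id_, rnk))
    return ranked


def correct_top_n(rows, n):
    # Rank every row first, then sort by (month, id) and apply the limit.
    ranked = rank_desc(rows)
    return sorted(ranked, key=lambda r: (r[0], r[1]))[:n]


def pushed_down_top_n(rows, n):
    # Assumed faulty plan: the limit is applied in the analytic sort's
    # order (month asc, id desc), and only the survivors are re-sorted.
    ranked = rank_desc(rows)
    truncated = sorted(ranked, key=lambda r: (r[0], -r[1]))[:n]
    return sorted(truncated, key=lambda r: (r[0], r[1]))


# Hypothetical data: six rows in month 11, three rows in month 12.
rows = [(11, i) for i in range(1, 7)] + [(12, i) for i in range(7, 10)]

print(correct_top_n(rows, 3))      # lowest ids, highest ranks
print(pushed_down_top_n(rows, 3))  # highest ids, ranks 3, 2, 1
```

With this data the correct plan returns the three lowest ids with the highest ranks, as in Hive's output, while the pushed-down variant returns the three highest ids with ranks 3, 2, 1, as in the incorrect output above.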