Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Description
Each ORDER BY key can be specified in ascending or descending order.
Recently, I found that ORDER BY produces a wrong result when the first sort key is in descending order.
If the second key is in descending order, it works well. Other cases also work correctly.
select l_orderkey, l_partkey from lineitem order by l_orderkey, l_partkey desc;
l_orderkey, l_partkey
-------------------------------
1, 155190
1, 67310
1, 63700
1, 24027
1, 15635
1, 2132
2, 106170
3, 183095
3, 128449
3, 62143
3, 29380
3, 19036
3, 4297
...
However, if the first sort key is in descending order, the query returns a wrong number of rows and shows the wrong part of the key range, as follows:
default> select l_orderkey, l_partkey from lineitem order by l_orderkey desc, l_partkey;
l_orderkey, l_partkey
-------------------------------
3000000, 61045
3000000, 159113
3000000, 167695
3000000, 167904
3000000, 196339
...
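For reference, the ordering that a mixed ascending/descending ORDER BY is expected to produce can be sketched in a few lines of Python. This is only an illustration of the expected semantics, not the engine's sort implementation; `make_comparator` and the sample rows are hypothetical names and data chosen for the sketch.

```python
from functools import cmp_to_key

def make_comparator(sort_spec):
    """Build a comparator from (column_index, ascending) pairs,
    listed from the most significant sort key to the least."""
    def compare(a, b):
        for idx, ascending in sort_spec:
            if a[idx] != b[idx]:
                result = -1 if a[idx] < b[idx] else 1
                # Flip the result for a descending key.
                return result if ascending else -result
        return 0
    return compare

# Hypothetical (l_orderkey, l_partkey) sample rows.
rows = [(1, 155190), (3, 4297), (1, 2132), (2, 106170), (3, 183095)]

# ORDER BY l_orderkey DESC, l_partkey ASC
desc_first = sorted(rows, key=cmp_to_key(make_comparator([(0, False), (1, True)])))

# ORDER BY l_orderkey ASC, l_partkey DESC
desc_second = sorted(rows, key=cmp_to_key(make_comparator([(0, True), (1, False)])))
```

With a descending first key, the largest `l_orderkey` in the whole table should come first, so the first rows of the query above should belong to the largest order key in `lineitem`.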
According to my investigation, this seems to be related to an offset problem in RowFile or to an index problem. The final result includes duplicated rows, and the final row is wrong, as the following output files show:
part-02-000000-000
3000000|61045
3000000|159113
3000000|167695
3000000|167904
3000000|196339
2999975|28334
2999975|194023
2999974|8020
2999974|124152
2999974|129921
2999974|139248
2999974|168914
2999974|187923
2999973|30533
2999973|36196
...
2919713|133486
2919713|195963
2919712|86257
2919712|94542
2919712|107370
2919712|166342 <- duplicated rows
2919712|178277
....
1|63700
1|67310
1|155190
[EOF]
part-02-000001-000
|96127 <- looks wrong
6000000|32255
6000000|96127
5999975|6452
5999975|7272
5999975|37131
....
....
2919713|133486
2919713|195963
2919712|94542
2919712|107370
2919712|166342 <- duplicated rows
[EOF]
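
The symptoms visible in these files (a malformed row, rows repeated across files) can be detected mechanically. The following is a minimal sketch, assuming each output line has the form `l_orderkey|l_partkey` and that a repeated pair is suspicious; `parse_row` and `check_parts` are hypothetical helper names, and the sample input is a small excerpt of the rows listed above.

```python
from collections import Counter

def parse_row(line):
    """Return (l_orderkey, l_partkey), or None if the line is malformed."""
    fields = line.strip().split("|")
    if len(fields) != 2 or not all(f.isdigit() for f in fields):
        return None
    return int(fields[0]), int(fields[1])

def check_parts(parts):
    """parts maps a part file name to its list of text lines.
    Returns (malformed, out_of_order, duplicated)."""
    malformed, out_of_order = [], []
    counts = Counter()
    for name, lines in parts.items():
        prev_key = None
        for line in lines:
            row = parse_row(line)
            if row is None:
                malformed.append((name, line))
                continue
            counts[row] += 1
            # Expected order: l_orderkey descending, l_partkey ascending.
            key = (-row[0], row[1])
            if prev_key is not None and key < prev_key:
                out_of_order.append((name, row))
            prev_key = key
    duplicated = sorted(row for row, n in counts.items() if n > 1)
    return malformed, out_of_order, duplicated

# Excerpt of the rows shown above.
parts = {
    "part-02-000000-000": ["3000000|61045", "3000000|159113", "2919712|166342"],
    "part-02-000001-000": ["|96127", "6000000|32255", "6000000|96127", "2919712|166342"],
}
malformed, out_of_order, duplicated = check_parts(parts)
```

On this excerpt the check flags `|96127` as malformed and `2919712|166342` as duplicated, while each file on its own is in the expected order. That matches the observation that the rows within a file look sorted but the files overlap in key range.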