Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
The following two queries can return different counts:
select count(*) from tbl;
select count(*) from (select * from tbl distribute by rand()) a;
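The linked issues (KYLIN-3388, SPARK-24607) point at one plausible mechanism: if a map task is re-run after some reducers have already fetched its output, the retry draws fresh random numbers, so rows can land on a different reducer than in the first attempt. The Python sketch below is a toy simulation of that scenario, not Hive code. The function names, the two-reducer setup, and the failure timing are all assumptions made for illustration.

```python
import random


def random_partition(rows, num_reducers, rng):
    """Assign each row to a reducer by a random draw (stand-in for DISTRIBUTE BY rand())."""
    parts = [[] for _ in range(num_reducers)]
    for row in rows:
        parts[rng.randrange(num_reducers)].append(row)
    return parts


def keyed_partition(rows, num_reducers):
    """Assign each row to a reducer by a function of the row itself (repeatable on retry)."""
    parts = [[] for _ in range(num_reducers)]
    for row in rows:
        parts[row % num_reducers].append(row)
    return parts


def rows_seen_by_reducers(first_attempt, retry_attempt):
    """Assumed failure timing: reducer 0 read the first attempt's output,
    then that output was lost and every other reducer read the retry's output."""
    seen = list(first_attempt[0])
    for part in retry_attempt[1:]:
        seen.extend(part)
    return seen


rows = list(range(1000))
rng = random.Random(0)

# Random partitioning: the retry does not reproduce the first attempt.
received_random = rows_seen_by_reducers(
    random_partition(rows, 2, rng),
    random_partition(rows, 2, rng),
)
lost = set(rows) - set(received_random)
duplicated = len(received_random) - len(set(received_random))

# Keyed partitioning: the retry reproduces the first attempt exactly.
received_keyed = rows_seen_by_reducers(
    keyed_partition(rows, 2),
    keyed_partition(rows, 2),
)

print("random:", len(received_random), "rows,", len(lost), "lost,", duplicated, "duplicated")
print("keyed: ", len(received_keyed), "rows")
```

In this model, rows that moved from reducer 1 to reducer 0 between attempts are lost and rows that moved the other way are counted twice, so the total can drift away from the true row count. Partitioning on a value derived from the row avoids the drift because the retry reproduces the same assignment.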
Issue Links
- causes
  - KYLIN-3388 Data may become not correct if mappers fail during the redistribute step, "distribute by rand()" (Closed)
- is related to
  - SPARK-24607 Distribute by rand() can lead to data inconsistency (Resolved)
  - HIVE-13108 Operators: SORT BY randomness is not safe with network partitions (Closed)
  - HIVE-20220 Incorrect result when hive.groupby.skewindata is enabled (Patch Available)