Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Version(s): 3.1.0, 4.0.0
Description
The result of UNIQUEJOIN in Hive on Tez is inconsistent with the result in Hive on MR, and the Tez result is not correct. For example:
CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc;
CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc;
INSERT INTO T1_n1x VALUES ('aaa', '111'), ('bbb', '222'), ('ccc', '333');
INSERT INTO T2_n1x VALUES ('aaa', '111'), ('ddd', '444'), ('ccc', '333');
SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key);
Hive on Tez result: wrong
a.key | b.key
aaa   | aaa
bbb   | NULL
ccc   | ccc
NULL  | ddd
Hive on MR result: right
a.key | b.key
aaa   | aaa
bbb   | NULL
ccc   | ccc
SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);
Hive on Tez result: wrong
a.key | b.key
aaa   | aaa
bbb   | NULL
ccc   | ccc
NULL  | ddd
Hive on MR result: right
a.key | b.key
aaa   | aaa
ccc   | ccc
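As a cross-check for the second query, the same rows can be computed with a plain inner join. This is only a sketch: it assumes that UNIQUEJOIN without PRESERVE keeps just the keys present in both tables, which is what the MR result above suggests. It is not taken from the fix itself.

```sql
-- Assumption: UNIQUEJOIN without PRESERVE behaves like an inner join
-- on the listed key expressions (a.key and b.key).
SELECT a.key, b.key
FROM T1_n1x a
JOIN T2_n1x b ON a.key = b.key;
```

With the sample data above, this should return only the aaa and ccc rows (row order is not guaranteed), so any engine that returns the bbb or ddd rows for the second UNIQUEJOIN query disagrees with it.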