Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
2.1.0
-
None
-
None
Description
Example is the bellow. Each subquery 'key' column is unique. But when they are joined on 'key' column, a result is duplicated.
CREATE TABLE test ( key STRING, type STRING, value INT ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'; LOAD DATA LOCAL INPATH '/tmp/test.gz' OVERWRITE INTO TABLE test; SELECT * FROM test; A type1 30000 B type2 20000 C type2 5000 SELECT l.* FROM ( SELECT * FROM test LATERAL VIEW explode(ARRAY(key)) e AS dammy ) l JOIN ( SELECT key, rank() OVER (PARTITION BY type ORDER BY value DESC) rank FROM test ) r ON l.key = r.key A type1 30000 A A type1 30000 A B type2 20000 B B type2 20000 B C type2 5000 C C type2 5000 C