Description
An INSERT OVERWRITE TABLE query does not generate a correct task plan when both hive.optimize.union.remove and hive.merge.sparkfiles are set to true.
set hive.optimize.union.remove=true;
set hive.merge.sparkfiles=true;

insert overwrite table outputTbl1
SELECT * FROM (
  select key, 1 as values from inputTbl1
  union all
  select * FROM (
    SELECT key, count(1) as values from inputTbl1 group by key
    UNION ALL
    SELECT key, 2 as values from inputTbl1
  ) a
) b;

select * from outputTbl1 order by key, values;
Query result:
1 1
1 2
2 1
2 2
3 1
3 2
7 1
7 2
8 2
8 2
8 2
Expected result:
1 1
1 1
1 2
2 1
2 1
2 2
3 1
3 1
3 2
7 1
7 1
7 2
8 1
8 1
8 2
8 2
8 2
The move work does not run correctly, and some rows are lost during the move: the query returns 11 rows where 17 are expected.
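
To see exactly which rows are lost, the two result sets above can be compared as multisets. The following is only a sketch for checking the report by hand: the helper name to_rows is hypothetical, and the row data is copied from the query result and expected result listed in this description. It does not touch Hive.

```python
from collections import Counter


def to_rows(flat):
    """Split a flat 'key value key value ...' string into (key, value) tuples."""
    nums = [int(tok) for tok in flat.split()]
    return list(zip(nums[0::2], nums[1::2]))


# Rows copied from this report (hypothetical variable names).
actual = to_rows("1 1 1 2 2 1 2 2 3 1 3 2 7 1 7 2 8 2 8 2 8 2")
expected = to_rows(
    "1 1 1 1 1 2 2 1 2 1 2 2 3 1 3 1 3 2 "
    "7 1 7 1 7 2 8 1 8 1 8 2 8 2 8 2"
)

# Counter subtraction keeps duplicates, so a row that should appear twice
# but appears once is reported as missing once.
missing = Counter(expected) - Counter(actual)
missing_rows = sorted(missing.elements())

print(len(actual), len(expected))
print(missing_rows)
```

Every missing row has values = 1, so the lost data belongs to the branches that emit 1 (the constant-1 select and the count(1) aggregate). Which branch's output is dropped cannot be determined from the results alone.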