Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 1.2.1
- None
Description
The HQL statement looks like this:
CREATE TEMPORARY TABLE tez_union_all_loss_data AS
SELECT xxx, yyy, zzz, 1 AS tag
FROM ods_1
UNION ALL
SELECT xxx, yyy, zzz, tag
FROM
(
    SELECT xxx
          ,get_json_object(get_json_object(tb, '$.a'), '$.b') AS yyy
          ,zzz
          ,2 AS tag
    FROM ods_2
    LATERAL VIEW EXPLODE(some_udf(uuu)) team_number AS tb
) tbl
;
With the above HQL, we expect the result to contain rows with tag = 1 as well as rows with tag = 2. In our case, however, all the rows with tag = 1 are lost.
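A quick way to check for the symptom, assuming the temporary table from the statement above still exists in the current session, is to count the rows per tag. This is only a sketch for verification; the alias row_cnt is a name introduced here for illustration.

```sql
-- Count the rows produced by each branch of the UNION ALL.
-- tag = 1 comes from ods_1, tag = 2 from the LATERAL VIEW branch on ods_2.
-- If the tag = 1 rows were lost as described, no tag = 1 group is returned.
SELECT tag, COUNT(*) AS row_cnt
FROM tez_union_all_loss_data
GROUP BY tag;
```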
Digging deeper, we found that the two generated map tasks have identical task tmp paths. This happens because, when a UDTF is present, the FileSinkOperator is processed twice in GenTezUtils.removeUnionOperators(), where the tmp path is generated.
Attachments
Issue Links
- Is contained by: HIVE-26751 Bug Fixes and Improvements for 3.2.0 release (Open)