Details
- Type: Sub-task
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Consider:
create table T stored as ORC TBLPROPERTIES('transactional'='true') as select a, b from A where a <= 5 union all select a, b from B where a >= 5
and
create table T (a int, b int) stored as ORC TBLPROPERTIES ('transactional'='false');
insert into T(a,b) select a, b from T where a between 1 and 3 group by a, b union all select a, b from A where a between 5 and 7 union all select a, b from B where a >= 9
On Tez, there is an optimization that removes the Union All operator and writes the data into subdirectories of T (in this case T is unpartitioned).
This also happens on MR, but requires:
hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_OPTIMIZE_UNION_REMOVE, true);
hiveConf.setVar(HiveConf.ConfVars.HIVEFETCHTASKCONVERSION, "none");
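For reproducing this from a Hive session rather than from test code, the two HiveConf variables above should correspond to the session properties shown here. This is a sketch, assuming the execution engine is switched to MR in the same session; that first line is an assumption about the setup, not part of the original report.

```sql
-- Assumption: run on MapReduce instead of Tez for this session.
set hive.execution.engine=mr;

-- Session-level equivalent of HiveConf.ConfVars.HIVE_OPTIMIZE_UNION_REMOVE = true
set hive.optimize.union.remove=true;

-- Session-level equivalent of HiveConf.ConfVars.HIVEFETCHTASKCONVERSION = "none"
set hive.fetch.task.conversion=none;
```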
When the target table is Acid, we need to ensure that unique ROW__IDs are generated.
When the target is not Acid, we need to ensure that it can be converted to Acid via Alter Table even when the data layout includes subdirectories.
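The two requirements could be exercised roughly as follows. This is only a sketch, assuming the table T from the examples above has already been populated by one of the Union All statements; the duplicate check is one possible way to look for colliding ROW__IDs, not necessarily how the fix was verified.

```sql
-- Non-Acid target: convert the existing table (whose data may sit in
-- subdirectories written by the Union All optimization) to Acid.
alter table T SET TBLPROPERTIES ('transactional'='true');

-- Acid target (or after conversion): every row should have a distinct
-- ROW__ID, so this query is expected to return no rows.
select ROW__ID, count(*) from T group by ROW__ID having count(*) > 1;
```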
Attachments
Issue Links
- is blocked by
  - HIVE-17204 support un-bucketed tables in acid (Resolved)
- is related to
  - HIVE-18021 Insert overwrite on acid table with Union All optimizations (Open)
  - HIVE-16177 non Acid to acid conversion doesn't handle _copy_N files (Closed)
- links to