Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Multiple patches have targeted an issue where INSERT OVERWRITE was ineffective if the input was empty:
HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before overwriting
HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input is empty
HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the input is empty
Of these patches, HIVE-21714 seems to have a bad effect on external tables, because of this part:
https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
The original issue before HIVE-21714 was that the original files in the table survived an insert overwrite, so a subsequent select still returned rows (select>0). HIVE-21714 seems to enable writing empty files regardless of execution engine / table type, which is not the proper approach: the proper solution would be to avoid writing empty files for Tez entirely (this is what HIVE-14014 was about). I found that changing the condition to...
if (!isTez && (isStreaming || this.isInsertOverwrite))
(which could be an easy solution for external tables) breaks some test cases (both full ACID and MM) in insert_overwrite.q, which could mean they rely somehow on the empty generated file. We need to find a proper solution which is applicable for all table types without polluting external tables.
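To make the effect of the condition quoted above easier to see, here is a minimal sketch that evaluates it for every combination of the three flags. The class name EmptyFilePolicy and the method shouldWriteEmptyFile are hypothetical and exist only for illustration; only the boolean expression itself is taken from this description, and the sketch does not reproduce Hive's actual code.

```java
// Hypothetical sketch: class and method names are illustrative, not Hive's actual code.
final class EmptyFilePolicy {

    // Mirrors the condition quoted in the description:
    //   !isTez && (isStreaming || isInsertOverwrite)
    static boolean shouldWriteEmptyFile(boolean isTez,
                                        boolean isStreaming,
                                        boolean isInsertOverwrite) {
        return !isTez && (isStreaming || isInsertOverwrite);
    }

    public static void main(String[] args) {
        boolean[] flags = {false, true};
        System.out.println("isTez  isStreaming  isInsertOverwrite  -> writeEmptyFile");
        // Enumerate all eight flag combinations and print the decision for each.
        for (boolean isTez : flags) {
            for (boolean isStreaming : flags) {
                for (boolean isInsertOverwrite : flags) {
                    System.out.printf("%-6b %-12b %-18b -> %b%n",
                            isTez, isStreaming, isInsertOverwrite,
                            shouldWriteEmptyFile(isTez, isStreaming, isInsertOverwrite));
                }
            }
        }
    }
}
```

With this condition, no empty file is ever written when the engine is Tez, including for an insert overwrite with empty input. That would be consistent with the observation that the ACID and MM cases in insert_overwrite.q may depend on the empty file being generated.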
Attachments
Issue Links
- is caused by: HIVE-21714 Insert overwrite on an acid/mm table is ineffective if the input is empty (Closed)
- is related to: HIVE-22918 Investigate empty bucket file creation for ACID tables (Resolved)
- relates to: HIVE-22949 Handle empty insert overwrites without inserting an empty file in case of acid/mm tables (In Progress)