Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
0.8.0
-
None
-
None
Description
When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in bad file names:
Before INSERT INTO:
000000_0.gz
After INSERT INTO:
000000_0.gz
000000_0.gz_copy_1
This causes corrupted output when doing a SELECT * on the table.
Correct behavior should be to pick a valid filename such as:
000000_0_copy_1.gz