-
Type:
Bug
-
Status: Closed
-
Priority:
Major
-
Resolution: Fixed
-
Affects Version/s: 0.8.0
-
Fix Version/s: 0.8.0
-
Component/s: None
-
Labels:None
When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in bad file names:
Before INSERT INTO:
000000_0.gz
After INSERT INTO:
000000_0.gz
000000_0.gz_copy_1
This causes corrupted output when doing a SELECT * on the table.
Correct behavior should be to pick a valid filename such as:
000000_0_copy_1.gz