Details
Type: Bug
Status: Patch Available
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.1.1, 2.3.4, 4.0.0
Description
An INSERT OVERWRITE into a static partition may not clean the partition's HDFS location if the partition's information is not stored in the metadata.
Steps to reproduce:
------------------------------------------------
1. Create a managed table:
------------------------------------------------
CREATE TABLE `test`(
  `id` string)
PARTITIONED BY (
  `dayno` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1564731656')
------------------------------------------------
2. Create the partition's directory directly on HDFS and put some data in it:
------------------------------------------------
hdfs dfs -mkdir hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
hdfs dfs -put test.data hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
------------------------------------------------
3. Run INSERT OVERWRITE on partition dayno=20190802:
------------------------------------------------
INSERT OVERWRITE TABLE test PARTITION(dayno='20190802') SELECT "some value";
------------------------------------------------
4. Observe that test.data under the partition directory has not been deleted.
------------------------------------------------
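The precondition and the outcome of the steps above can be checked from the command line. The following is a minimal sketch, not part of the original report: it assumes the `hive` and `hdfs` CLIs are on the PATH of a machine that can reach the cluster, and the function names `show_partitions` and `list_partition_dir` are hypothetical helpers introduced only for illustration. The paths and table name are the ones used in the reproduction steps.

```shell
#!/usr/bin/env bash
# Paths and names are taken from the reproduction steps; helper names are hypothetical.
TABLE_DIR="hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test"
PART_DIR="${TABLE_DIR}/dayno=20190802"

# Run between step 2 and step 3: list the partitions the metastore knows about.
# For the issue to apply, dayno=20190802 should not appear in this list.
show_partitions() {
  hive -e "SHOW PARTITIONS test;"
}

# Run after step 3: list the files in the partition directory on HDFS.
list_partition_dir() {
  hdfs dfs -ls "${PART_DIR}"
}

# Only touch the cluster when explicitly asked to, e.g. `./check.sh run`.
if [ "${1:-}" = "run" ]; then
  show_partitions
  list_partition_dir
fi
```

If `list_partition_dir` still lists `test.data` after step 3 alongside the newly written files, the behavior described in this issue has been reproduced.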