Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
While registering partitions through the Thrift API, we are seeing the associated HDFS path deleted by a thread that did not create it.
This causes loss of the data in that directory. In the example below, three threads try to create the same directory and only one of them succeeds in registering the partition; the other two then delete the directory that the successful thread created and registered.
hadoop-cmf-hive-HIVEMETASTORE-******.41:2020-07-02 08:50:31,307 INFO org.apache.hadoop.hive.common.FileUtils: [pool-5-thread-379217]: Creating directory if it doesn't exist: hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-******.41:2020-07-02 08:50:31,308 INFO org.apache.hadoop.hive.common.FileUtils: [pool-5-thread-386717]: Creating directory if it doesn't exist: hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-******.41:2020-07-02 08:50:31,308 INFO org.apache.hadoop.hive.common.FileUtils: [pool-5-thread-379074]: Creating directory if it doesn't exist: hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-******.41:2020-07-02 08:50:31,314 INFO hive.metastore.hivemetastoressimpl: [pool-5-thread-386717]: deleting hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-******.41:2020-07-02 08:50:31,315 INFO hive.metastore.hivemetastoressimpl: [pool-5-thread-379217]: deleting hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-******.41:2020-07-02 08:50:31,321 INFO org.apache.hadoop.fs.TrashPolicyDefault: [pool-5-thread-386717]: Moved: 'hdfs://test_path/dt=2020-07-02/hhmm-0850' to trash at: hdfs://user/test/.Trash/Current/test/dt=2020-07-02/hhmm=0850
hadoop-cmf-hive-HIVEMETASTORE-******.41:2020-07-02 08:50:31,321 INFO hive.metastore.hivemetastoressimpl: [pool-5-thread-386717]: Moved to trash: hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-******.41:2020-07-02 08:50:31,323 ERROR hive.log: [pool-5-thread-379217]: Got exception: java.io.IOException Failed to move to trash: hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-******.41:java.io.IOException: Failed to move to trash: hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-******.41:2020-07-02 08:50:31,328 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-379217]: MetaException(message:Got exception: java.io.IOException Failed to move to trash: hdfs://test_path/dt=2020-07-02/hhmm-0850)
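The log sequence above is consistent with a check-then-act race: several callers prepare the same directory, all but one fail to register the partition, and the failed callers then run cleanup on a directory they do not own. The following is a minimal Java sketch of one way to avoid that class of race, not Hive's actual code and not necessarily the fix the project would choose. The class `PartitionAddSketch`, its in-memory `registered` set (a stand-in for the metastore's partition table), and the helper `countWinners` are all hypothetical, and the local file system stands in for HDFS. The idea is that a caller must claim the partition name atomically before touching the file system, so losing callers never create or delete anything.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PartitionAddSketch {
    // Hypothetical stand-in for the metastore's partition table.
    private final Set<String> registered = ConcurrentHashMap.newKeySet();

    /** Returns true only for the single caller that registered the partition. */
    boolean addPartition(String name, Path location) throws IOException {
        // Atomic claim: exactly one concurrent caller gets true here.
        if (!registered.add(name)) {
            // Losing callers return without creating or deleting anything.
            return false;
        }
        try {
            // Only the winner prepares the directory.
            Files.createDirectories(location);
            return true;
        } catch (IOException e) {
            // Roll back the winner's own claim; no other caller's state is touched.
            registered.remove(name);
            throw e;
        }
    }

    /** Runs `threads` concurrent registrations of one partition and counts successes. */
    static int countWinners(int threads, Path location) throws Exception {
        PartitionAddSketch store = new PartitionAddSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                tasks.add(() -> store.addPartition("example_partition", location));
            }
            int winners = 0;
            for (Future<Boolean> result : pool.invokeAll(tasks)) {
                if (result.get()) {
                    winners++;
                }
            }
            return winners;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path base = Files.createTempDirectory("partition-sketch");
        Path location = base.resolve("dt=2020-07-02").resolve("hhmm-0850");
        int winners = countWinners(3, location);
        System.out.println("winners=" + winners
                + ", directory exists=" + Files.isDirectory(location));
    }
}
```

With three concurrent callers, as in the log, one caller should register the partition and the directory should still exist afterwards, because the two losing callers never reach any cleanup path. A real metastore would also have to handle a crash between the claim and the directory creation, which this sketch ignores.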