Details
Bug
Status: Resolved
Major
Resolution: Fixed
Impala 2.2.4
Description
In certain workflows, it is useful to INSERT into a temp table and then use the LOAD DATA statement to move the data from the temp table into a main table. This workflow fails because the INSERT leaves an _impala_insert_staging directory behind in the temp table's directory, and LOAD DATA rejects an INPATH location that contains subdirectories. The staging directory is a pain to delete, and in workflows where this pattern is common, invoking hadoop fs to delete it each time can consume a large amount of time.
e.g.
drwxrwxrwt - impala hive 0 2016-01-15 16:55 /user/hive/warehouse/test1/_impala_insert_staging
-rw-r--r-- 3 impala hive 3 2016-01-15 16:55 /user/hive/warehouse/test1/b45b4bccab740ab-c119e053e1efd6a2_528952759_data.0
and
> load data inpath '/user/hive/warehouse/test1/' into table test2;
Query: load data inpath '/user/hive/warehouse/test1/' into table test2
ERROR: AnalysisException: INPATH location 'hdfs://nn:8020/user/hive/warehouse/test1' cannot contain subdirectories.
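
The manual workaround described above (removing the staging directory before issuing the LOAD) might be scripted roughly as follows. This is only a sketch: the helper name staging_cleanup_commands is hypothetical and not part of Impala or Hadoop, and it assumes the usual eight-column hadoop fs -ls listing format with paths that contain no spaces. It only builds the hadoop fs -rm -r command strings; running them is left to the caller.

```python
import shlex
from typing import List


def staging_cleanup_commands(ls_output: str) -> List[str]:
    """Build `hadoop fs -rm -r` commands for each subdirectory in an ls listing.

    Hypothetical helper: it parses text in the `hadoop fs -ls` format and
    does not call HDFS itself.
    """
    commands = []
    for line in ls_output.splitlines():
        fields = line.split()
        # A listing entry has 8 columns; a leading 'd' in the permission
        # string marks a directory. Header lines such as "Found 2 items"
        # have fewer columns and are skipped.
        if len(fields) >= 8 and fields[0].startswith("d"):
            path = fields[-1]  # assumes the path contains no spaces
            commands.append("hadoop fs -rm -r " + shlex.quote(path))
    return commands


if __name__ == "__main__":
    listing = (
        "drwxrwxrwt - impala hive 0 2016-01-15 16:55 "
        "/user/hive/warehouse/test1/_impala_insert_staging\n"
        "-rw-r--r-- 3 impala hive 3 2016-01-15 16:55 "
        "/user/hive/warehouse/test1/"
        "b45b4bccab740ab-c119e053e1efd6a2_528952759_data.0\n"
    )
    for command in staging_cleanup_commands(listing):
        print(command)
```

Each generated command still costs one hadoop fs invocation, which is the overhead this issue complains about, so the sketch illustrates the workaround rather than removing its cost.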