Description
When we use CTAS (CREATE TABLE AS SELECT) to create a new table on S3, the table location is not set correctly. As a result, the data from the existing table cannot be inserted into the newly created table.
We can use the following example to reproduce this issue.
set hive.metastore.warehouse.dir=OUTPUT_PATH;
drop table s3_dir_test;
drop table s3_1;
drop table s3_2;
create external table s3_dir_test(strct struct<a:int, b:string, c:string>)
row format delimited
fields terminated by '\t'
collection items terminated by ' '
location 'INPUT_PATH';
create table s3_1(strct struct<a:int, b:string, c:string>)
row format delimited
fields terminated by '\t'
collection items terminated by ' ';
insert overwrite table s3_1 select * from s3_dir_test;
select * from s3_1;
create table s3_2 as select * from s3_1;
select * from s3_1;
select * from s3_2;
The data used in the test could be as follows.
1 abc 10.5
2 def 11.5
3 ajss 90.23232
4 djns 89.02002
5 random 2.99
6 data 3.002
7 ne 71.9084
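As a quick sanity check (not part of the original script), the location recorded in the metastore for each table can be inspected with describe formatted; comparing the Location values of the two tables shows that the location of the CTAS-created table is not set correctly.
describe formatted s3_1;
describe formatted s3_2;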
The root cause is that the SemanticAnalyzer class does not handle the S3 location properly for CTAS.
A patch will be provided shortly.
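Until the patch is in, a workaround consistent with the behavior above is to avoid CTAS and instead create the target table explicitly and populate it with an insert, which is exactly how s3_1 is loaded successfully in the script. A minimal sketch, where s3_3 is a hypothetical table name:
create table s3_3(strct struct<a:int, b:string, c:string>)
row format delimited
fields terminated by '\t'
collection items terminated by ' ';
insert overwrite table s3_3 select * from s3_1;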
Issue Links
is required by SPARK-21514: Hive has updated with new support for S3 and InsertIntoHiveTable.scala should update also (Resolved)