Details
Type: Improvement
Status: Open
Priority: Blocker
Resolution: Unresolved
Affects Version/s: Impala 2.2.4
Fix Version/s: None
Description
I expect that by far the most common use case for CREATE TABLE LIKE PARQUET is to make a table where the specified Parquet file will be queried. That is, either:
CREATE TABLE foo LIKE PARQUET '/blah/blah/file.parq' STORED AS PARQUET;
LOAD DATA INPATH '/blah/blah/file.parq' INTO TABLE foo;
or
CREATE EXTERNAL TABLE foo LIKE PARQUET '/blah/blah/file.parq' STORED AS PARQUET LOCATION '/blah/blah';
I have difficulty imagining a case where someone would do CREATE TABLE LIKE PARQUET and want the result to be a text table. Even if someone planned to convert Parquet -> text, they would need to have a Parquet table to begin with, in which case they would do CREATE TABLE text_table LIKE parquet_table, not CREATE TABLE LIKE PARQUET.
It is easy to leave the STORED AS PARQUET clause off a CREATE TABLE LIKE PARQUET statement by mistake, because PARQUET already occurs earlier in the statement. The result is a text table that throws conversion errors when queried. How about making Parquet the default format in this case, and requiring the STORED AS clause only to use a different file format? (Then if Impala implemented a CREATE TABLE LIKE AVRO syntax, the default in that case would be Avro.)
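To make the proposed rule concrete, here is a minimal Python sketch of how the effective file format could be resolved. It is only an illustration of the suggestion, not Impala's analyzer code: the function name `effective_format`, its parameters, and the `proposed` switch are all hypothetical.

```python
from typing import Optional

# Hypothetical model of the proposal; this is NOT Impala's actual code.
# Today a table created without STORED AS ends up as a text table.
CURRENT_DEFAULT = "TEXTFILE"


def effective_format(like_source: Optional[str],
                     stored_as: Optional[str],
                     proposed: bool = True) -> str:
    """Return the file format the new table would end up with.

    like_source: format named in CREATE TABLE ... LIKE <format> '<file>',
                 or None for a plain CREATE TABLE.
    stored_as:   format from an explicit STORED AS clause, or None if omitted.
    proposed:    True models the suggested behavior, False the current one.
    """
    if stored_as is not None:
        # An explicit STORED AS clause always wins.
        return stored_as.upper()
    if proposed and like_source is not None:
        # Proposed: default to the format named in the LIKE clause.
        return like_source.upper()
    # Current behavior: fall back to the text default.
    return CURRENT_DEFAULT
```

Under this sketch, `effective_format("parquet", None)` yields a Parquet table, while `effective_format("parquet", None, proposed=False)` models the current behavior that produces a text table. An explicit `STORED AS` still overrides the default in both modes.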
Since I guess this would qualify as an incompatible change, we would need to think through the appropriate release vehicle.
Issue Links
- is related to: IMPALA-4753 Table created like parquet file shows wrong row count (Resolved)