Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Description
This bug is a follow-on from KUDU-1299.
I could INSERT...SELECT from my Impala table into an unpartitioned Kudu table (after removing a TIMESTAMP column that caused an error). When I tried the exact same operation with a partitioned Kudu table as the destination, I got another conversion error saying the Impala BOOLEAN type wasn't supported. Again, the Kudu docs say the BOOLEAN type should work:
create TABLE log_ingest_docs_kudu2 (
  id bigint
  , ip string
  , f2 string
  , f3 string
  , the_date string
  , method string
  , path string
  , status smallint
  , size bigint
  , referer string
  , agent string
  , is_search_term boolean
  , search_term string
  , is_doc_page boolean
  , doc_page string
)
DISTRIBUTE BY HASH (id) INTO 10 BUCKETS
TBLPROPERTIES (
  'kudu.master_addresses'='yadayada:7051'
  , 'kudu.key_columns'='id'
  , 'kudu.table_name'='log_ingest_docs_kudu2'
  , 'storage_handler'='com.cloudera.kudu.hive.KuduStorageHandler'
);

insert into log_ingest_docs_kudu2 select * from log_ingest_docs_kudu;
WARNINGS: Impala type BOOLEAN is not available in Kudu.
Impala type BOOLEAN is not available in Kudu.

select count(*) from log_ingest_docs_kudu2;
+----------+
| count(*) |
+----------+
| 0        |
+----------+

select distinct is_search_term, is_doc_page from log_ingest_docs_kudu;
WARNINGS: Impala type BOOLEAN is not available in Kudu.
Impala type BOOLEAN is not available in Kudu.
Notice a couple of things:
- In this case, the source table is another Kudu table, not the original Impala table. (How does Kudu know at that point that the BOOLEAN data inside a Kudu table is the Impala BOOLEAN type?)
- The only difference between table log_ingest_docs_kudu (where the INSERT...SELECT worked) and log_ingest_docs_kudu2 (where it failed) is the DISTRIBUTE BY HASH added to the latter.
- The BOOLEAN message is reported as a warning, yet it caused the operation to fail: no rows were inserted into the destination table.
- If I go back to log_ingest_docs_kudu, where the data (including the BOOLEAN columns) was successfully transferred, I find that I can't run any query that references a BOOLEAN column. So the problem manifests itself consistently in the SELECT statement, and inconsistently in the INSERT statement (depending on whether the table has a DISTRIBUTE BY clause).
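The observations above could probably be isolated with a much smaller pair of tables. The following is only a sketch of such a reduced test case, not something that was run: the table names bool_src, bool_flat, and bool_hashed and the column flag are hypothetical, bool_src is assumed to be an existing Impala table with columns (id bigint, flag boolean), and the CREATE TABLE syntax and property values are copied from the statement in this report.

```sql
-- Hypothetical reduced test case; names are illustrative and the statements are untested.
-- Assumes an existing Impala table bool_src (id bigint, flag boolean).

-- Kudu table without a DISTRIBUTE BY clause.
create TABLE bool_flat ( id bigint , flag boolean )
TBLPROPERTIES (
  'kudu.master_addresses'='yadayada:7051'
  , 'kudu.key_columns'='id'
  , 'kudu.table_name'='bool_flat'
  , 'storage_handler'='com.cloudera.kudu.hive.KuduStorageHandler'
);

-- Identical table, except for the DISTRIBUTE BY HASH clause.
create TABLE bool_hashed ( id bigint , flag boolean )
DISTRIBUTE BY HASH (id) INTO 10 BUCKETS
TBLPROPERTIES (
  'kudu.master_addresses'='yadayada:7051'
  , 'kudu.key_columns'='id'
  , 'kudu.table_name'='bool_hashed'
  , 'storage_handler'='com.cloudera.kudu.hive.KuduStorageHandler'
);

-- In the report, the INSERT into the table without DISTRIBUTE BY succeeded.
insert into bool_flat select * from bool_src;

-- In the report, the INSERT into the hash-distributed table produced the
-- BOOLEAN warning and left the destination empty.
insert into bool_hashed select * from bool_src;
select count(*) from bool_hashed;

-- In the report, any SELECT referencing a BOOLEAN column of a Kudu table
-- produced the same warning.
select distinct flag from bool_flat;
```

If the behavior matches the full-size tables, this would suggest the BOOLEAN column alone triggers the problem, independent of the other columns in log_ingest_docs_kudu2.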