Details
Description
I am trying to export a String column from Hive ORC table into Oracle CLOB column using HCatalog in Sqoop 1.4.6 (both HDP 2.4 and HDP 2.6).
- HDP 2.4: Sqoop 1.4.6.2.4.3.0-227
- HDP 2.6: Sqoop 1.4.6.2.6.3.0-235
Table DDL in Oracle:
CREATE TABLE ORACLE_CLOB_TABLE (ID NVARCHAR2(10), CLOBCOLUMN CLOB);
Table DDL in Hive:
CREATE TABLE default.hive_clob_table (id string, clobcolumn string) ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat';
insert into default.hive_clob_table (id, clobcolumn) values ("A0001", "This is a very large string text");
insert into default.hive_clob_table (id, clobcolumn) values ("B0002", "This is the second very large string text");
Sqoop Export Command:
CASE 1 – Use Lower Case on Table name and Column list
- The command is not working on HDP 2.6 at all due to Oracle tables being defined as CAPITALIZED table and column names.
- The command is working fine on HDP 2.4, with CLOB column exported accordingly
sqoop \
export \
-Dmapreduce.job.queuename=queue \
--connect 'jdbc:oracle:thin:@//host:port/database_name'\
--username 'user_name' \
--password 'user_password' \
--verbose \
--table 'oracle_clob_table' \
--columns 'id, clobcolumn' \
--hcatalog-table 'hive_clob_table' \
--hcatalog-database 'default' \
--map-column-java clobcolumn=String
Output in Oracle:
A0001 | This is a very large string text |
B0002 | This is the second very large string text |
CASE 2 – Use Upper Case on Table name and Column list
- The command is working fine on both HDP 2.4 and HDP 2.6. However, CLOB column exported as NULL.
sqoop \
export \
-Dmapreduce.job.queuename=queue \
--connect 'jdbc:oracle:thin:@//host:port/database_name'\
--username 'user_name' \
--password 'user_password' \
--verbose \
--table 'ORACLE_CLOB_TABLE' \
--columns 'ID, CLOBCOLUMN' \
--hcatalog-table 'hive_clob_table' \
--hcatalog-database 'default' \
--map-column-java CLOBCOLUMN=String
Output in Oracle:
A0001 | NULL |
B0002 | NULL |
Additional Test Done
I did an additional test by Re-creating Oracle table with small case column names:
Table DDL in Oracle:
CREATE TABLE oracle_clob_table ("id" NVARCHAR2(10), "clobcolumn" CLOB);
sqoop \
export \
-Dmapreduce.job.queuename=queue \
--connect 'jdbc:oracle:thin:@//host:port/database_name'\
--username 'user_name' \
--password 'user_password' \
--verbose \
--table 'oracle_clob_table' \
--columns 'id, clobcolumn' \
--hcatalog-table 'hive_clob_table' \
--hcatalog-database 'default' \
--map-column-java clobcolumn=String
Output in Oracle:
A0001 | This is a very large string text |
B0002 | This is the second very large string text |
I am curious why would the CASE of the column used in sqoop export command would affect the CLOB export ability? Thank you.