Uploaded image for project: 'Sqoop (Retired)'
  1. Sqoop (Retired)
  2. SQOOP-3254

sqoop export Hive ORC to Oracle CLOB columns are somewhat case-sensitive

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Critical
    • Resolution: Unresolved
    • 1.4.6
    • None
    • None
    • HDP 2.4 and HDP 2.6

    Description

      I am trying to export a String column from Hive ORC table into Oracle CLOB column using HCatalog in Sqoop 1.4.6 (both HDP 2.4 and HDP 2.6).

      • HDP 2.4: Sqoop 1.4.6.2.4.3.0-227
      • HDP 2.6: Sqoop 1.4.6.2.6.3.0-235

      Table DDL in Oracle:
      CREATE TABLE ORACLE_CLOB_TABLE (ID NVARCHAR2(10), CLOBCOLUMN CLOB);

      Table DDL in Hive:
      CREATE TABLE default.hive_clob_table (id string, clobcolumn string) ROW FORMAT SERDE
      'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
      STORED AS INPUTFORMAT
      'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
      OUTPUTFORMAT
      'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat';

      insert into default.hive_clob_table (id, clobcolumn) values ("A0001", "This is a very large string text");
      insert into default.hive_clob_table (id, clobcolumn) values ("B0002", "This is the second very large string text");

      Sqoop Export Command:

      CASE 1 – Use Lower Case on Table name and Column list

      • The command is not working on HDP 2.6 at all due to Oracle tables being defined as CAPITALIZED table and column names.
      • The command is working fine on HDP 2.4, with CLOB column exported accordingly

      sqoop \
      export \
      -Dmapreduce.job.queuename=queue \
      --connect 'jdbc:oracle:thin:@//host:port/database_name'\
      --username 'user_name' \
      --password 'user_password' \
      --verbose \
      --table 'oracle_clob_table' \
      --columns 'id, clobcolumn' \
      --hcatalog-table 'hive_clob_table' \
      --hcatalog-database 'default' \
      --map-column-java clobcolumn=String

      Output in Oracle:

      A0001 This is a very large string text
      B0002 This is the second very large string text

      CASE 2 – Use Upper Case on Table name and Column list

      • The command is working fine on both HDP 2.4 and HDP 2.6. However, CLOB column exported as NULL.

      sqoop \
      export \
      -Dmapreduce.job.queuename=queue \
      --connect 'jdbc:oracle:thin:@//host:port/database_name'\
      --username 'user_name' \
      --password 'user_password' \
      --verbose \
      --table 'ORACLE_CLOB_TABLE' \
      --columns 'ID, CLOBCOLUMN' \
      --hcatalog-table 'hive_clob_table' \
      --hcatalog-database 'default' \
      --map-column-java CLOBCOLUMN=String

      Output in Oracle:

      A0001 NULL
      B0002 NULL

      Additional Test Done
      I did an additional test by Re-creating Oracle table with small case column names:

      Table DDL in Oracle:
      CREATE TABLE oracle_clob_table ("id" NVARCHAR2(10), "clobcolumn" CLOB);

      sqoop \
      export \
      -Dmapreduce.job.queuename=queue \
      --connect 'jdbc:oracle:thin:@//host:port/database_name'\
      --username 'user_name' \
      --password 'user_password' \
      --verbose \
      --table 'oracle_clob_table' \
      --columns 'id, clobcolumn' \
      --hcatalog-table 'hive_clob_table' \
      --hcatalog-database 'default' \
      --map-column-java clobcolumn=String

      Output in Oracle:

      A0001 This is a very large string text
      B0002 This is the second very large string text

      I am curious why would the CASE of the column used in sqoop export command would affect the CLOB export ability? Thank you.

      Attachments

        Activity

          People

            Unassigned Unassigned
            Suryani Suryani Simon Turtan
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: