Sqoop (Retired) — SQOOP-3210

Sqoop1 (import + --hive-import + --map-column-hive) using column aliases fails with error (Import failed: No column by the name)


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Environment:
      $ sqoop version
      17/07/17 12:28:51 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.10.1

    Description

      Sqoop1 (import + --hive-import + --map-column-hive) fails with the error (Import failed: No column by the name) when the --map-column-hive keys use column aliases or database.table qualifiers; see the test case below.

      Please enable the ability to use column aliases.

      Test Case

      #################
      # STEP 01 - Create Table and Data
      #################
      
      export MYCONN=jdbc:mysql://host.domain.com:3306/db1
      export MYUSER=sqoop
      export MYPSWD=sqoop
      
      sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "drop table t1"
      sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "create table t1 (c1 int, c2 date, c3 varchar(10))"
      sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into t1 values (1, current_date, 'some data')"
      sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1"
      
      Output:
      -----------------------------------------
      | c1          | c2         | c3         | 
      -----------------------------------------
      | 1           | 2017-07-17 | some data  | 
      -----------------------------------------
      
      #################
      # STEP 02 - Verify Hive Import (baseline)
      #################
      
      beeline -u jdbc:hive2:// -e "use db1; drop table t1 purge;"
      sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1 where \$CONDITIONS" --num-mappers 1 --target-dir /data/dbs/db1/t1 --delete-target-dir --fields-terminated-by ',' --lines-terminated-by '\n' --as-textfile --hive-import --hive-database db1 --hive-table t1 
      beeline -u jdbc:hive2:// -e "use db1; desc t1; select * from t1"
      
      Output:
      +-----------+------------+----------+--+
      | col_name  | data_type  | comment  |
      +-----------+------------+----------+--+
      | c1        | int        |          |
      | c2        | string     |          |
      | c3        | string     |          |
      +-----------+------------+----------+--+
      +--------+-------------+------------+--+
      | t1.c1  |    t1.c2    |   t1.c3    |
      +--------+-------------+------------+--+
      | 1      | 2017-07-17  | some data  |
      +--------+-------------+------------+--+
      
      #################
      # STEP 03 - Verify Hive Import with (--map-column-hive)
      #################
      
      beeline -u jdbc:hive2:// -e "use db1; drop table t1 purge;"
      sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1 where \$CONDITIONS" --num-mappers 1 --target-dir /data/dbs/db1/t1 --delete-target-dir --fields-terminated-by ',' --lines-terminated-by '\n' --as-textfile --hive-import --hive-database db1 --hive-table t1 --map-column-hive "c2=date,c3=varchar(10)"  
      beeline -u jdbc:hive2:// -e "use db1; desc t1; select * from t1"
      
      Output:
      +-----------+--------------+----------+--+
      | col_name  |  data_type   | comment  |
      +-----------+--------------+----------+--+
      | c1        | int          |          |
      | c2        | date         |          |
      | c3        | varchar(10)  |          |
      +-----------+--------------+----------+--+
      +--------+-------------+------------+--+
      | t1.c1  |    t1.c2    |   t1.c3    |
      +--------+-------------+------------+--+
      | 1      | 2017-07-17  | some data  |
      +--------+-------------+------------+--+
      
      #################
      # STEP 04 - Verify Hive Import with (--map-column-hive) and (database.table) notation
      #################
      
      beeline -u jdbc:hive2:// -e "use db1; drop table t1 purge;"
      sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1 where \$CONDITIONS" --num-mappers 1 --target-dir /data/dbs/db1/t1 --delete-target-dir --fields-terminated-by ',' --lines-terminated-by '\n' --as-textfile --hive-import --hive-database db1 --hive-table t1 --map-column-hive "db1.c2=date,db1.c3=varchar(10)"  
      beeline -u jdbc:hive2:// -e "use db1; desc t1; select * from t1"
      
      Output:
      17/07/17 12:10:47 INFO mapreduce.ImportJobBase: Transferred 23 bytes in 20.602 seconds (1.1164 bytes/sec)
      17/07/17 12:10:47 INFO mapreduce.ImportJobBase: Retrieved 1 records.
      17/07/17 12:10:47 INFO manager.SqlManager: Executing SQL statement: select * from t1 where  (1 = 0) 
      17/07/17 12:10:47 INFO manager.SqlManager: Executing SQL statement: select * from t1 where  (1 = 0) 
      17/07/17 12:10:47 ERROR tool.ImportTool: Import failed: No column by the name db1.c3found while importing data
      
      #################
      # STEP 05 - Verify Hive Import with (--map-column-hive) and (column alias) notation
      #################
      
      beeline -u jdbc:hive2:// -e "use db1; drop table t1 purge;"
      sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select a.c1, a.c2, a.c3 from t1 a where \$CONDITIONS" --num-mappers 1 --target-dir /data/dbs/db1/t1 --delete-target-dir --fields-terminated-by ',' --lines-terminated-by '\n' --as-textfile --hive-import --hive-database db1 --hive-table t1 --map-column-hive "a.c2=date,a.c3=varchar(10)"  
      beeline -u jdbc:hive2:// -e "use db1; desc t1; select * from t1"
      
      Output:
      17/07/17 12:13:12 INFO mapreduce.ImportJobBase: Transferred 23 bytes in 20.7211 seconds (1.11 bytes/sec)
      17/07/17 12:13:12 INFO mapreduce.ImportJobBase: Retrieved 1 records.
      17/07/17 12:13:12 INFO manager.SqlManager: Executing SQL statement: select a.c1, a.c2, a.c3 from t1 a where  (1 = 0) 
      17/07/17 12:13:12 INFO manager.SqlManager: Executing SQL statement: select a.c1, a.c2, a.c3 from t1 a where  (1 = 0) 
      17/07/17 12:13:12 ERROR tool.ImportTool: Import failed: No column by the name a.c3found while importing data
      
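Taken together, STEP 03 succeeds with bare column names while STEPs 04 and 05 fail with qualified keys, which suggests the --map-column-hive keys are compared verbatim against the unqualified column names in the JDBC result-set metadata (obtained via the `where (1 = 0)` probe query). A minimal sketch of that apparent matching behavior, assuming this is the cause — this is hypothetical illustration, not Sqoop's actual source, and `check_map_column_hive` is an invented name:

```python
def check_map_column_hive(result_set_columns, hive_type_map):
    """Raise if any --map-column-hive key is not an exact result-set column name.

    Hypothetical sketch of the apparent matching rule: keys are compared
    verbatim, so qualified keys like 'db1.c3' or 'a.c3' never match the
    unqualified names ('c1', 'c2', 'c3') that the metadata probe returns.
    """
    for key in hive_type_map:
        if key not in result_set_columns:
            raise ValueError(
                "Import failed: No column by the name %s found while importing data"
                % key)

# Column names as returned by: select ... from t1 ... where (1 = 0)
cols = ["c1", "c2", "c3"]

# STEP 03: bare names match, so the import proceeds
check_map_column_hive(cols, {"c2": "date", "c3": "varchar(10)"})

# STEP 05: an alias-qualified key never matches and the import aborts
try:
    check_map_column_hive(cols, {"a.c3": "varchar(10)"})
except ValueError as e:
    print(e)  # the 'No column by the name ...' error, as in the log above
```

If that reading is right, a likely workaround for STEP 05 is to keep the aliases in the query but pass bare column names in the mapping, i.e. --map-column-hive "c2=date,c3=varchar(10)", exactly as STEP 03 does.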

      People

        Assignee: Unassigned
        Reporter: Markus Kemper (markuskemper@me.com)