Uploaded image for project: 'Sqoop (Retired)'
  1. Sqoop (Retired)
  2. SQOOP-3280

Sqoop import with time data type from mysql is not working as expected.

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • None
    • None

    Description

      Hi Team,

      Hope you are doing good !!!

      ##########################
      Problem Statement
      ##########################

      Sqoop import with time data type from mysql is not working as expected.

      ##########################
      Detailed Problem Statement
      ##########################

      We are trying to import the time datatype from Mysql via Sqoop and it is not working as expected and as mentioned in the Mysql document the value time data type can import.

      If we set the time(hour) more than 24 then it doesn't work fine but if we set the hour less than 24 then it imports well.

      Now if we see the mysql document(https://dev.mysql.com/doc/refman/5.7/en/time.html) for Time data range which is '-838:59:59' to '838:59:59' but Sqoop is not working as per this range set.

      Note:- I am creating 2 scenarios (working and non-working) to give more details on this with replication steps that will help you to replicate this in house.

       

      ##########################
      Replication Steps ----> Working Scenario
      ##########################

      Step 1:- Create table in Mysql.
      ----------------------------------
      mysql> create table repro_time( timevalue time);
      Query OK, 0 rows affected (0.08 sec)

      mysql> insert into repro_time values('24:24:24');
      Query OK, 1 row affected (0.06 sec)

      mysql> select * from repro_time;
      -----------

      timevalue

      -----------

      24:24:24

      -----------
      1 row in set (0.01 sec)

      Step 2:- Sqoop import into HDFS
      ---------------------------------
      [root@host-10-17-101-232 ~]# export MYCONN=jdbc:mysql://host-10-17-101-231.coe.cloudera.com/test
      [root@host-10-17-101-232 ~]# export MYUSER=*****
      [root@host-10-17-101-232 ~]# export MYPSWD=*****

      [root@host-10-17-101-232 ~]# sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table repro_time --target-dir '/user/root/repro_time' --delete-target-dir -m 1

      Bytes Read=0
      File Output Format Counters
      Bytes Written=9
      18/01/22 21:41:57 INFO mapreduce.ImportJobBase: Transferred 9 bytes in 17.5695 seconds (0.5123 bytes/sec)
      18/01/22 21:41:57 INFO mapreduce.ImportJobBase: Retrieved 1 records.

      [root@host-10-17-101-232 ~]# hadoop fs -cat repro_time/p*
      00:24:24

      Note:- We set the hour as 24 so that's why it has sett 00 over here which is normal behaviour.

      ##########################
      Replication Steps ----> Non-Working Scenario
      ##########################

      Step1:- Create table in Mysql
      ------------------------------

      mysql> create table repro_time( timevalue time);
      Query OK, 0 rows affected (0.08 sec)

      mysql> insert into repro_time values('24:24:24');
      Query OK, 1 row affected (0.06 sec)

      mysql> select * from repro_time;
      -----------

      timevalue

      -----------

      24:24:24

      -----------
      1 row in set (0.01 sec)

      mysql> insert into repro_time values('25:24:24');
      Query OK, 1 row affected (0.02 sec)

      mysql> select * from repro_time;
      -----------

      timevalue

      -----------

      24:24:24
      25:24:24

      -----------
      2 rows in set (0.00 sec)

      Note:- Here you can see I have inserted a second value of hour as 25 which is more than 24 and it has inserted successfully because of the time range for Mysql is -838:59:59' to '838:59:59'.

      Step 2:- Sqoop import into HDFS
      ---------------------------------
      [root@host-10-17-101-232 ~]# export MYCONN=jdbc:mysql://host-10-17-101-231.coe.cloudera.com/test
      [root@host-10-17-101-232 ~]# export MYUSER=****
      [root@host-10-17-101-232 ~]# export MYPSWD=****

      [root@host-10-17-101-232 ~]# sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table repro_time --target-dir '/user/root/repro_time1' --delete-target-dir -m 1

      18/01/22 21:42:34 INFO mapreduce.Job: Job job_1516093492107_2868 running in uber mode : false
      18/01/22 21:42:34 INFO mapreduce.Job: map 0% reduce 0%
      18/01/22 21:42:40 INFO mapreduce.Job: Task Id : attempt_1516093492107_2868_m_000000_0, Status : FAILED
      Error: java.io.IOException: SQLException in nextKeyValue
      at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
      at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
      at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
      at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
      at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
      at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:415)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
      at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
      Caused by: java.sql.SQLException: Illegal hour value '25' for java.sql.Time type in value '25:24:24.
      at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:965)
      at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:898)
      at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:887)
      at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:861)
      at com.mysql.jdbc.TimeUtil.fastTimeCreate(TimeUtil.java:286)
      at com.mysql.jdbc.ResultSetImpl.fastTimeCreate(ResultSetImpl.java:979)
      at com.mysql.jdbc.ResultSetRow.getTimeFast(ResultSetRow.java:884)
      at com.mysql.jdbc.BufferRow.getTimeFast(BufferRow.java:543)
      at com.mysql.jdbc.ResultSetImpl.getTimeInternal(ResultSetImpl.java:5575)
      at com.mysql.jdbc.ResultSetImpl.getTime(ResultSetImpl.java:5343)
      at org.apache.sqoop.lib.JdbcWritableBridge.readTime(JdbcWritableBridge.java:106)
      at com.cloudera.sqoop.lib.JdbcWritableBridge.readTime(JdbcWritableBridge.java:78)
      at repro_time.readFields(repro_time.java:90)
      at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:244)

      Note:- Here according to Sqoop it is an invalid hour 25 and because of that it has thrown the error.

      I will be attaching the full Sqoop output(Sqoop_time_mysql_output.txt) and Yarn job bundle(212c5296-1ef9-4081-a65a-0fde003b341f.default.20180123-06-00-37.support-bundle.zip) for your reference that will help you to get more details.

      #########################
      Ask From the Team
      ########################

      This looks like a bug to me but I would be needing your help to fix this bug so that Sqoop can take the time which is a range for mysql.

      Please let me know if you have any questions.

      Regards
      Nitish Khanna

      Attachments

        Activity

          People

            Unassigned Unassigned
            nitish.khanna Niitsh Khanna
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated: