Sqoop
  1. Sqoop
  2. SQOOP-341

Sqoop doesn't handle unsigned ints at least with MySQL

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0-incubating
    • Component/s: None
    • Labels:
      None

      Description

      mysql> describe Inventory;
      -----------------------------------------------------------+

      Field Type Null Key Default Extra

      -----------------------------------------------------------+

      ProductID smallint(4) unsigned zerofill YES   NULL  
      Quantity int(10) unsigned YES   NULL  

      -----------------------------------------------------------+
      2 rows in set (0.00 sec)

      mysql> insert into Inventory Values (9999,9999999999);
      Query OK, 1 row affected, 1 warning (0.00 sec)

      mysql> select * from Inventory;
      ---------------------+

      ProductID Quantity

      ---------------------+

      9999 4294967295

      ---------------------+

      Get's the following error

      ava.io.IOException: SQLException in nextKeyValue
      at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:248)
      at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:456)
      at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
      at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:188)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
      at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
      Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLDataException: '4.294967295E9' in column '2' is outside valid range for the datatype INTEGER.
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
      at com.mysql.jdbc.Util.handleNewInstance(Util.java:407)
      at com.mysql.jdbc.Util.getInstance(Util.java:382)
      at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1025)
      at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987)
      at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982)
      at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
      at com.mysql.jdbc.ResultSetImpl.throwRangeException(ResultSetImpl.java:7970)
      at com.mysql.jdbc.ResultSetImpl.parseIntAsDouble(ResultSetImpl.java:7199)
      at com.mysql.jdbc.ResultSetImpl.getInt(ResultSetImpl.java:2702)
      at org.apache.sqoop.lib.JdbcWritableBridge.readInteger(JdbcWritableBridge.java:51)
      at Inventory.readFields(Inventory.java:75)
      at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:244)
      ... 7 more

      This seems to be caused by sqoop having no notion of unsigned values when importing from MySQL(or maybe in general)

        Activity

        Hide
        Alex Newman added a comment -

        I'm guessing since hive has no notion of a unsigned type, we should just use a bigint for unsigned types. We may also have to do some work to actually get mysql to tell us that it's an unsigned int, as signed/unsigned is a foreign concept in jdbc.

        Show
        Alex Newman added a comment - I'm guessing since hive has no notion of a unsigned type, we should just use a bigint for unsigned types. We may also have to do some work to actually get mysql to tell us that it's an unsigned int, as signed/unsigned is a foreign concept in jdbc.
        Hide
        Arvind Prabhakar added a comment -

        This happens because the MySQL JDBC driver returns the type java.sql.Types.INTEGER which is used by Sqoop to generate the field mapping. In case of unsigned integer, the mapping should instead be to long (see this page for reference).

        It may be possible to do this by refactoring the method SqlManager.getColumnTypesForRawQuery(String stmt) so that the actual typeId is resolved by a method call instead of metadata.getColumnType. The MySQLManager can then override this method and use the ResultSetMetaData.getColumnClassName() and appropriately return java.sql.Types.BIGINT instead.

        I have not spent a lot of time looking at it, so there could be other code paths that may need to be covered.

        Show
        Arvind Prabhakar added a comment - This happens because the MySQL JDBC driver returns the type java.sql.Types.INTEGER which is used by Sqoop to generate the field mapping. In case of unsigned integer, the mapping should instead be to long (see this page for reference). It may be possible to do this by refactoring the method SqlManager.getColumnTypesForRawQuery(String stmt) so that the actual typeId is resolved by a method call instead of metadata.getColumnType . The MySQLManager can then override this method and use the ResultSetMetaData.getColumnClassName() and appropriately return java.sql.Types.BIGINT instead. I have not spent a lot of time looking at it, so there could be other code paths that may need to be covered.
        Hide
        Alex Newman added a comment -

        I pushed a patch to fix this. I'm not sure if we want to test this in unit tests. Or how we would.

        Show
        Alex Newman added a comment - I pushed a patch to fix this. I'm not sure if we want to test this in unit tests. Or how we would.
        Hide
        Arvind Prabhakar added a comment -

        This patch covers the SqlManager case. For consistency, we should also update the DataDrivenDBInputFormat so that the lookup for splitter uses the same type information as what is used in mapping. Other than that, the patch looks good.

        Show
        Arvind Prabhakar added a comment - This patch covers the SqlManager case. For consistency, we should also update the DataDrivenDBInputFormat so that the lookup for splitter uses the same type information as what is used in mapping. Other than that, the patch looks good.
        Hide
        Alex Newman added a comment -

        Updated! Cheers Arvind. I am also open minded to adding some testing here.

        Show
        Alex Newman added a comment - Updated! Cheers Arvind. I am also open minded to adding some testing here.
        Hide
        Arvind Prabhakar added a comment -

        Thanks Alex. I would suggest adding a test case that you have described in the description to the MySQL manger tests.

        Show
        Arvind Prabhakar added a comment - Thanks Alex. I would suggest adding a test case that you have described in the description to the MySQL manger tests.
        Hide
        Alex Newman added a comment -

        I ran ant checkstyle and fixed this one.

        Show
        Alex Newman added a comment - I ran ant checkstyle and fixed this one.
        Hide
        Arvind Prabhakar added a comment -

        +1

        Show
        Arvind Prabhakar added a comment - +1
        Hide
        Arvind Prabhakar added a comment -

        Patch committed. Thanks Alex!

        Show
        Arvind Prabhakar added a comment - Patch committed. Thanks Alex!
        Hide
        Hudson added a comment -

        Integrated in Sqoop-jdk-1.6 #34 (See https://builds.apache.org/job/Sqoop-jdk-1.6/34/)
        SQOOP-341. Support for unsigned integers with MySQL.

        (Alex Newman via Arvind Prabhakar)

        arvind : http://svn.apache.org/viewvc/?view=rev&rev=1178661
        Files :

        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/manager/SqlManager.java
        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/mapreduce/db/DataDrivenDBInputFormat.java
        • /incubator/sqoop/trunk/src/test/com/cloudera/sqoop/manager/DirectMySQLTest.java
        Show
        Hudson added a comment - Integrated in Sqoop-jdk-1.6 #34 (See https://builds.apache.org/job/Sqoop-jdk-1.6/34/ ) SQOOP-341 . Support for unsigned integers with MySQL. (Alex Newman via Arvind Prabhakar) arvind : http://svn.apache.org/viewvc/?view=rev&rev=1178661 Files : /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/manager/SqlManager.java /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/mapreduce/db/DataDrivenDBInputFormat.java /incubator/sqoop/trunk/src/test/com/cloudera/sqoop/manager/DirectMySQLTest.java

          People

          • Assignee:
            Alex Newman
            Reporter:
            Alex Newman
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development