SQOOP-359

Import fails with Unknown SQL datatype exception

    Details

      Description

      To reproduce this, run a free-form query import with the number of mappers set to 1 (-m 1) and no boundary query specified. For example:

      $ sqoop import --connect jdbc:mysql://localhost/testdb --username test --password **** \
          --query 'SELECT TDX.A, TDX.B FROM TDX WHERE $CONDITIONS' \
          --target-dir /user/arvind/MYSQL/TDX1 -m 1
      

      This import will fail as follows:

      11/10/06 15:37:59 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-arvind/compile/190f858175a9f99756e503727c931450/QueryResult.jar
      11/10/06 15:37:59 INFO mapreduce.ImportJobBase: Beginning query import.
      11/10/06 15:38:00 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(null), MAX(null) FROM (SELECT TDX.A, TDX.B FROM TDX WHERE  (1 = 1) ) AS t1
      11/10/06 15:38:00 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost/opt/site/cdh3u1/hadoop/data/tmp/mapred/staging/arvind/.staging/job_201110061528_0004
      11/10/06 15:38:00 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Unknown SQL data type: -3
      	at com.cloudera.sqoop.mapreduce.db.DataDrivenDBInputFormat.getSplits(DataDrivenDBInputFormat.java:211)
      	at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944)
      	at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961)
      	at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
      	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
      	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:396)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
      	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
      	at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
      	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
      	at com.cloudera.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:123)
      	at com.cloudera.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:183)
      	at com.cloudera.sqoop.manager.SqlManager.importQuery(SqlManager.java:450)
      	at com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:384)
      	at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:455)
      	at com.cloudera.sqoop.Sqoop.run(Sqoop.java:146)
      	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
      	at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
      	at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
      	at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
      	at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
      
      

      The problem seems to be that the bounding-values query (SELECT MIN(null), MAX(null) ...) is generated with a null column name, and its result is then used to determine the data type for splitting.
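
To illustrate the failure mode, here is a minimal sketch of a type dispatch that rejects unrecognised SQL types. The class and method names are hypothetical and this is not Sqoop's actual code; it only shows that the code in the log, -3, is the value of java.sql.Types.VARBINARY, presumably the type the driver reported for MIN(null), and that a dispatch without a case for it would fail with the same message.

```java
import java.io.IOException;
import java.sql.Types;

// Illustrative stand-in for a split-type dispatch; NOT Sqoop's implementation.
class SplitTypeSketch {

    // Returns a label for the kind of splitter a bounding-column type would use.
    static String splitterFor(int sqlType) throws IOException {
        switch (sqlType) {
            case Types.INTEGER:
            case Types.BIGINT:
            case Types.SMALLINT:
            case Types.TINYINT:
                return "integer";
            case Types.DECIMAL:
            case Types.NUMERIC:
                return "decimal";
            case Types.CHAR:
            case Types.VARCHAR:
                return "text";
            case Types.DATE:
            case Types.TIME:
            case Types.TIMESTAMP:
                return "date";
            default:
                // No splitter is known for this type, so the job cannot be split.
                throw new IOException("Unknown SQL data type: " + sqlType);
        }
    }

    public static void main(String[] args) {
        // java.sql.Types.VARBINARY is the constant -3, the code seen in the log.
        System.out.println("VARBINARY = " + Types.VARBINARY);
        try {
            splitterFor(Types.VARBINARY);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```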

      1. SQOOP-359-2.patch
        6 kB
        Arvind Prabhakar
      2. SQOOP-359-1.patch
        5 kB
        Arvind Prabhakar

        Activity

        Bilung Lee added a comment -

        Patch committed. Thanks, Arvind!

        Hudson added a comment -

        Integrated in Sqoop-jdk-1.6 #36 (See https://builds.apache.org/job/Sqoop-jdk-1.6/36/)
        SQOOP-359 Import fails with Unknown SQL datatype exception

        blee : http://svn.apache.org/viewvc/?view=rev&rev=1180279
        Files :

        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/manager/SqlManager.java
        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/mapreduce/DataDrivenImportJob.java
        • /incubator/sqoop/trunk/src/test/com/cloudera/sqoop/TestBoundaryQuery.java
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2270/#review2456
        -----------------------------------------------------------

        Ship it!

        • Bilung

        On 2011-10-07 23:04:25, Arvind Prabhakar wrote:

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2270/
        -----------------------------------------------------------

        (Updated 2011-10-07 23:04:25)

        Review request for Sqoop and Bilung Lee.

        Summary
        -------

        Modified TestBoundaryQuery to introduce a test that reproduces this problem. Also added a validation exception that is raised when the user uses a boundary query without specifying a split-by column.

        This addresses bug SQOOP-359.
        https://issues.apache.org/jira/browse/SQOOP-359

        Diffs
        -----

        /src/java/com/cloudera/sqoop/manager/SqlManager.java 1180266
        /src/java/com/cloudera/sqoop/mapreduce/DataDrivenImportJob.java 1180266
        /src/test/com/cloudera/sqoop/TestBoundaryQuery.java 1180266

        Diff: https://reviews.apache.org/r/2270/diff

        Testing
        -------

        All unit and thirdparty tests.

        Thanks,

        Arvind

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2270/
        -----------------------------------------------------------

        (Updated 2011-10-07 23:04:25.276941)

        Review request for Sqoop and Bilung Lee.

        Changes
        -------

        Updating patch with review feedback.

        Summary
        -------

        Modified TestBoundaryQuery to introduce a test that reproduces this problem. Also added a validation exception that is raised when the user uses a boundary query without specifying a split-by column.

        This addresses bug SQOOP-359.
        https://issues.apache.org/jira/browse/SQOOP-359
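
The validation described in the summary can be pictured roughly as follows. This is a hedged sketch only: the class name, method signature, exception type, and message text are assumptions made for illustration and are not taken from the patch.

```java
// Hypothetical sketch: names and message text are illustrative assumptions,
// not the code committed for SQOOP-359.
class BoundaryQueryValidation {

    // Rejects a user-supplied boundary query when no split-by column is given.
    static void validate(String boundaryQuery, String splitByColumn) {
        boolean hasBoundaryQuery = boundaryQuery != null && !boundaryQuery.trim().isEmpty();
        boolean hasSplitBy = splitByColumn != null && !splitByColumn.trim().isEmpty();
        if (hasBoundaryQuery && !hasSplitBy) {
            throw new IllegalArgumentException(
                "A boundary query requires a split-by column to be specified.");
        }
    }

    public static void main(String[] args) {
        validate(null, null);                                 // accepted: no boundary query
        validate("SELECT MIN(A), MAX(A) FROM TDX", "A");      // accepted: both supplied
        try {
            validate("SELECT MIN(A), MAX(A) FROM TDX", null); // rejected: no split-by column
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Failing fast at option-validation time, before any job is submitted, would surface the misconfiguration with a clear message rather than an exception raised during split computation.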

        Diffs (updated)


        /src/java/com/cloudera/sqoop/manager/SqlManager.java 1180266
        /src/java/com/cloudera/sqoop/mapreduce/DataDrivenImportJob.java 1180266
        /src/test/com/cloudera/sqoop/TestBoundaryQuery.java 1180266

        Diff: https://reviews.apache.org/r/2270/diff

        Testing
        -------

        All unit and thirdparty tests.

        Thanks,

        Arvind

        jiraposter@reviews.apache.org added a comment -

        On 2011-10-07 21:01:09, Arvind Prabhakar wrote:

        > /src/java/com/cloudera/sqoop/mapreduce/DataDrivenImportJob.java, line 177
        > <https://reviews.apache.org/r/2270/diff/1/?file=48568#file48568line177>
        >
        > I could do that but then I will have to do a null check in DataDrivenDBInputFormat. Otherwise, the following code will raise an NPE. Hence I kept it separate. If you think that is better, I can certainly change that. Please let me know.

        I think I misunderstood your comment. I will update the patch to remove the redundant check. Thanks for pointing that out.

        • Arvind

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2270/#review2442
        -----------------------------------------------------------

        On 2011-10-07 02:31:15, Arvind Prabhakar wrote:

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2270/
        -----------------------------------------------------------

        (Updated 2011-10-07 02:31:15)

        Review request for Sqoop and Bilung Lee.

        Summary
        -------

        Modified TestBoundaryQuery to introduce a test that reproduces this problem. Also added a validation exception that is raised when the user uses a boundary query without specifying a split-by column.

        This addresses bug SQOOP-359.
        https://issues.apache.org/jira/browse/SQOOP-359

        Diffs
        -----

        /src/java/com/cloudera/sqoop/manager/SqlManager.java 1178856
        /src/java/com/cloudera/sqoop/mapreduce/DataDrivenImportJob.java 1178856
        /src/test/com/cloudera/sqoop/TestBoundaryQuery.java 1178856

        Diff: https://reviews.apache.org/r/2270/diff

        Testing
        -------

        All unit and thirdparty tests.

        Thanks,

        Arvind

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2270/#review2442
        -----------------------------------------------------------

        /src/java/com/cloudera/sqoop/mapreduce/DataDrivenImportJob.java
        <https://reviews.apache.org/r/2270/#comment5551>

        I could do that but then I will have to do a null check in DataDrivenDBInputFormat. Otherwise, the following code will raise an NPE. Hence I kept it separate. If you think that is better, I can certainly change that. Please let me know.

        • Arvind


        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2270/#review2441
        -----------------------------------------------------------

        Thanks for the patch. Looks good overall. One suggestion below.

        /src/java/com/cloudera/sqoop/mapreduce/DataDrivenImportJob.java
        <https://reviews.apache.org/r/2270/#comment5547>

        Would it be better to have this if clause inside the previous if clause? So you won't go through it twice if inputBoundingQuery is already not null.

        • Bilung
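
The shape of this suggestion can be sketched with a hypothetical method. Only the name inputBoundingQuery comes from the comment above; the class, the other parameters, and the generated SQL are assumptions made for illustration and do not reproduce the reviewed code. Nesting the second check inside the first means it is evaluated only when no boundary query was supplied.

```java
// Hypothetical sketch: only inputBoundingQuery is named in the review;
// everything else here is an illustrative assumption.
class BoundingQuerySketch {

    // Returns the bounding query to use, or null when none can be derived.
    static String resolve(String inputBoundingQuery, String splitByCol, String query) {
        if (inputBoundingQuery == null) {
            // Nested check: skipped entirely when a boundary query was supplied.
            if (splitByCol != null) {
                inputBoundingQuery = "SELECT MIN(" + splitByCol + "), MAX(" + splitByCol
                    + ") FROM (" + query + ") AS t1";
            }
        }
        return inputBoundingQuery;
    }

    public static void main(String[] args) {
        // A supplied boundary query is returned unchanged.
        System.out.println(resolve("SELECT 1, 10", null, "SELECT A FROM TDX"));
        // Otherwise one is derived from the split-by column, if there is one.
        System.out.println(resolve(null, "A", "SELECT A FROM TDX"));
        System.out.println(resolve(null, null, "SELECT A FROM TDX"));
    }
}
```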


        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2270/
        -----------------------------------------------------------

        Review request for Sqoop and Bilung Lee.

        Summary
        -------

        Modified TestBoundaryQuery to introduce a test that reproduces this problem. Also added a validation exception that is raised when the user uses a boundary query without specifying a split-by column.

        This addresses bug SQOOP-359.
        https://issues.apache.org/jira/browse/SQOOP-359

        Diffs


        /src/java/com/cloudera/sqoop/manager/SqlManager.java 1178856
        /src/java/com/cloudera/sqoop/mapreduce/DataDrivenImportJob.java 1178856
        /src/test/com/cloudera/sqoop/TestBoundaryQuery.java 1178856

        Diff: https://reviews.apache.org/r/2270/diff

        Testing
        -------

        All unit and thirdparty tests.

        Thanks,

        Arvind


          People

          • Assignee: Arvind Prabhakar
          • Reporter: Arvind Prabhakar