Sqoop (Retired) / SQOOP-2745

Using datetime column as a splitter for Oracle no longer works

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.6
    • Fix Version/s: 1.4.7
    • Component/s: None
    • Labels: None

    Description

      I recently looked into a case where using the Oracle connector to import data split by a datetime column (Date, Time, or Timestamp) fails with an error similar to the following:

      2015-12-15 23:03:41,902 INFO [main] org.apache.sqoop.mapreduce.db.DBInputFormat: Using read commited transaction isolation
      2015-12-15 23:03:42,089 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: C3_TIMESTAMP >= '2015-12-12 19:21:50.0' AND C3_TIMESTAMP < '2029-08-20 08:21:58.0'
      2015-12-15 23:03:42,238 INFO [main] org.apache.sqoop.mapreduce.db.OracleDBRecordReader: Time zone has been set to GMT
      2015-12-15 23:03:42,274 INFO [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Working on split: C3_TIMESTAMP >= '2015-12-12 19:21:50.0' AND C3_TIMESTAMP < '2029-08-20 08:21:58.0'
      2015-12-15 23:03:42,343 INFO [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: SELECT C1_INT, C2_DATE, C3_TIMESTAMP FROM V1_ORACLE_DATE_AND_TIMESTAMP WHERE ( C3_TIMESTAMP >= '2015-12-12 19:21:50.0' ) AND ( C3_TIMESTAMP < '2029-08-20 08:21:58.0' )
      2015-12-15 23:03:42,394 ERROR [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
      java.sql.SQLDataException: ORA-01843: not a valid month
      
      	at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445)
      	at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
      	at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:879)
      	at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:450)
      	at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:192)
      	at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
      	at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:207)
      	at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:884)
      	at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1167)
      	at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1289)
      	at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3584)
      	at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3628)
      	at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1493)
      	at org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
      	at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
      	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
      	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
      	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
      	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
      	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
      	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
      	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
      	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
      2015-12-15 23:03:42,421 INFO [Thread-12] org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false

      I looked into the problem and found the root cause. The Oracle connector uses a custom OracleDataDrivenDBInputFormat that overrides the getSplitter method of its parent class, DataDrivenDBInputFormat. This custom splitter is essential because it ensures that datetime constants are written correctly in the generated queries. However, SQOOP-2334 changed the method getSplitter(int) to getSplitter(int, long) without updating the Oracle connector, which now overrides a method that is no longer called. As a result, the generic splitter is selected instead, and the generated query contains plain quoted datetime literals (as in the log above), which Oracle rejects with ORA-01843.
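
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual Sqoop code: the class names BaseInputFormat, BrokenOracleInputFormat, FixedOracleInputFormat, and the Splitter interface are hypothetical stand-ins, and the TO_TIMESTAMP format string is an assumption about what an Oracle-specific splitter might emit. It only illustrates how a subclass that keeps overriding the old single-argument hook silently stops being used once the caller switches to a two-argument overload.

```java
import java.sql.Timestamp;

// Hypothetical, simplified stand-ins for illustration; not the real Sqoop classes.
interface Splitter {
    // Renders a timestamp as a SQL literal for a split boundary.
    String literal(Timestamp ts);
}

class BaseInputFormat {
    // Old hook (assumed to remain in the parent): nothing calls it anymore.
    protected Splitter getSplitter(int sqlType) {
        return ts -> "'" + ts + "'";
    }

    // New hook: the only overload the caller below invokes.
    protected Splitter getSplitter(int sqlType, long splitLimit) {
        return ts -> "'" + ts + "'";
    }

    // Builds a lower-bound predicate the way a split generator might.
    String lowerBound(String column, Timestamp ts) {
        Splitter splitter = getSplitter(java.sql.Types.TIMESTAMP, -1L);
        return column + " >= " + splitter.literal(ts);
    }
}

class BrokenOracleInputFormat extends BaseInputFormat {
    // Compiles cleanly because getSplitter(int) still exists in the parent,
    // but it is dead code: lowerBound() calls the two-argument overload.
    @Override
    protected Splitter getSplitter(int sqlType) {
        return ts -> "TO_TIMESTAMP('" + ts + "', 'YYYY-MM-DD HH24:MI:SS.FF')";
    }
}

class FixedOracleInputFormat extends BaseInputFormat {
    // Overrides the overload that is actually called.
    @Override
    protected Splitter getSplitter(int sqlType, long splitLimit) {
        return ts -> "TO_TIMESTAMP('" + ts + "', 'YYYY-MM-DD HH24:MI:SS.FF')";
    }
}

class SplitterOverrideDemo {
    public static void main(String[] args) {
        Timestamp ts = Timestamp.valueOf("2015-12-12 19:21:50");
        // Falls back to the generic quoted literal.
        System.out.println(new BrokenOracleInputFormat().lowerBound("C3_TIMESTAMP", ts));
        // Uses the Oracle-specific literal.
        System.out.println(new FixedOracleInputFormat().lowerBound("C3_TIMESTAMP", ts));
    }
}
```

In this sketch the broken subclass produces the same kind of quoted literal seen in the failing query, while the fixed subclass wraps the value in an explicit conversion. Note that @Override does not catch the problem as long as the old method remains in the parent class.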

      Attachments

        1. SQOOP-2745.patch
          7 kB
          Jarek Jarcec Cecho

            People

              Assignee: jarcec Jarek Jarcec Cecho
              Reporter: jarcec Jarek Jarcec Cecho
              Votes: 0
              Watchers: 4