Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
1.4.4
-
None
-
None
Description
Hi,
I am using Sqoop 1.4.4 to do incremental imports from an Oracle database, checking a timestamp column representing the instant of last update. The timestamp column is in GMT, while the server running Sqoop is in GMT+2, and I am looking for updates every 15 minutes. Due to the different timezones, Sqoop launches queries filtering on timestamp ranges that are in the future, resulting in no rows exported.
Looking at the source code of Sqoop, it seems that to calculate the upper bound (lower bound at next iteration) for the query, the program gets the current time in millisecond (getCurrentDbTimestamp() in org.apache.sqoop.manager.ConnManager) and creates a java.sql.Timestamp object from it. Then, org.apache.sqoop.tool.ImportTool calls datetimeToQueryString(String datetime, int columnType) of org.apache.sqoop.manager.OracleManager passing the timestamp as a string, and get back the sql filter to put in the final query. The problem seems to be that the toString() method of a java.sql.Timestamp object considers the user.timezone of the JVM (Europe/Rome in my case), so I get a String representing a date that is in the future if considered as GMT (as the Oracle database does with the SQL function TO_TIMESTAMP).
To solve this problem, it may be enough to force the conversion from Timestamp to String to work in GMT timezone (like setting user.timezone=GMT). Another solution, when interacting with Oracle, may be to use the TO_TIMESTAMP_TZ function, that allows to specify the timezone of the String representing the timestamp.