Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
1.4.7
-
None
Description
At the moment we're using SELECT SYSDATE FROM dual when getting current time from Oracle.
SYSDATE returns the underlying operating system's current time, while CURRENT_TIMESTAMP uses the session time zone. This could lead to problems during incremental imports when Oracle's time zone is different from the OS.
Consider the following scenario when Oracle is configured to +0:00, while the OS is +5:00:
Oracle time | OS time | Event |
---|---|---|
2:00 | 7:00 | sqoop import --last-value 1:00 ... => imports [1:00, 7:00) |
2:30 | 7:30 | update ... set last_updated = current_timestamp ... => set to 2:30 Won't be imported! |
3:00 | 8:00 | sqoop import --last-value 7:00 ... => imports [7:00, 8:00) |
This way records updated within 5 hours after the last sqoop import won't get imported.
Please note, that the example above assumes, that the user/administrator who's updating the Oracle table will use the current session time of Oracle when setting the "last updated" column of the table.
I think the solution is to use CURRENT_TIMESTAMP instead of SYSDATE. Other connection managers, like MySQL or PostgreSQL are using that as well.
Attachments
Attachments
Issue Links
- links to