Uploaded image for project: 'Sqoop'
  1. Sqoop
  2. SQOOP-3288

Incremental import's upper bound ignores session time zone in Oracle

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.7
    • Fix Version/s: 1.5.0
    • Component/s: connectors/oracle
    • Labels:
      None

      Description

      At the moment we're using SELECT SYSDATE FROM dual when getting current time from Oracle.

      SYSDATE returns the underlying operating system's current time, while CURRENT_TIMESTAMP uses the session time zone. This could lead to problems during incremental imports when Oracle's time zone is different from the OS.

      Consider the following scenario when Oracle is configured to +0:00, while the OS is +5:00:

      Oracle time OS time Event
      2:00 7:00 sqoop import --last-value 1:00 ... => imports [1:00, 7:00)
      2:30 7:30 update ... set last_updated = current_timestamp ... => set to 2:30 Won't be imported!
      3:00 8:00 sqoop import --last-value 7:00 ... => imports [7:00, 8:00)

      This way records updated within 5 hours after the last sqoop import won't get imported.

      Please note, that the example above assumes, that the user/administrator who's updating the Oracle table will use the current session time of Oracle when setting the "last updated" column of the table.

      I think the solution is to use CURRENT_TIMESTAMP instead of SYSDATE. Other connection managers, like MySQL or PostgreSQL are using that as well.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                dvoros Daniel Voros
                Reporter:
                dvoros Daniel Voros
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: