Uploaded image for project: 'Sqoop (Retired)'
  1. Sqoop (Retired)
  2. SQOOP-3288

Incremental import's upper bound ignores session time zone in Oracle

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 1.4.7
    • 1.5.0
    • connectors/oracle
    • None

    Description

      At the moment we're using SELECT SYSDATE FROM dual when getting current time from Oracle.

      SYSDATE returns the underlying operating system's current time, while CURRENT_TIMESTAMP uses the session time zone. This could lead to problems during incremental imports when Oracle's time zone is different from the OS.

      Consider the following scenario when Oracle is configured to +0:00, while the OS is +5:00:

      Oracle time OS time Event
      2:00 7:00 sqoop import --last-value 1:00 ... => imports [1:00, 7:00)
      2:30 7:30 update ... set last_updated = current_timestamp ... => set to 2:30 Won't be imported!
      3:00 8:00 sqoop import --last-value 7:00 ... => imports [7:00, 8:00)

      This way records updated within 5 hours after the last sqoop import won't get imported.

      Please note, that the example above assumes, that the user/administrator who's updating the Oracle table will use the current session time of Oracle when setting the "last updated" column of the table.

      I think the solution is to use CURRENT_TIMESTAMP instead of SYSDATE. Other connection managers, like MySQL or PostgreSQL are using that as well.

      Attachments

        1. SQOOP-3288.1.patch
          0.5 kB
          Daniel Voros

        Issue Links

          Activity

            People

              dvoros Daniel Voros
              dvoros Daniel Voros
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: