Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Description
The current documentation does not mention that --as-parquetfile converts all date/timestamp types to Long format (milliseconds since epoch), and that the conversion is made relative to the session timezone.
This can cause inconsistencies when the data was inserted in a timezone that differs from that of the host running the sqoop command.
In addition, the current documentation for Oracle says the following:
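To illustrate the kind of inconsistency meant here, the following is a minimal sketch that does not use Sqoop code. It only shows that the same wall-clock timestamp maps to different epoch-millisecond values depending on the timezone it is interpreted in. The helper name `to_epoch_millis`, the sample timestamp, and the fixed UTC-8 offset are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Reference point for "milliseconds since epoch".
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(wall_clock: str, tz: timezone) -> int:
    """Interpret a zone-less wall-clock string in `tz` and return epoch millis.

    Hypothetical helper for illustration only; it is not part of Sqoop.
    """
    naive = datetime.strptime(wall_clock, "%Y-%m-%d %H:%M:%S")
    aware = naive.replace(tzinfo=tz)
    # Integer division by a timedelta avoids floating-point rounding.
    return (aware - EPOCH) // timedelta(milliseconds=1)


# Assumed example: a fixed UTC-8 offset standing in for a host in another zone.
UTC_MINUS_8 = timezone(timedelta(hours=-8))

stored_value = "2017-01-01 00:00:00"
as_utc = to_epoch_millis(stored_value, timezone.utc)   # 1483228800000
as_minus_8 = to_epoch_millis(stored_value, UTC_MINUS_8)  # 1483257600000

# The two Long values differ by the 8-hour offset, in milliseconds.
print(as_minus_8 - as_utc)  # 28800000
```

A reader of the Parquet file sees only the Long value, so without knowing which timezone was in effect during the import, the original wall-clock value cannot be recovered unambiguously.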
Oracle also includes the additional date/time types TIMESTAMP WITH TIMEZONE and TIMESTAMP WITH LOCAL TIMEZONE. To support these types, the user’s session timezone must be specified. By default, Sqoop will specify the timezone "GMT" to Oracle. You can override this setting by specifying a Hadoop property oracle.sessionTimeZone on the command-line when running a Sqoop job. For example:
What is not mentioned is that this applies only when OraOop (--direct) is enabled. The passage can also be read as saying that only 'TIMESTAMP WITH TIMEZONE' and 'TIMESTAMP WITH LOCAL TIMEZONE' columns are affected, rather than that the entire session will use the GMT timezone.