[SPARK-25652] Wrong datetime conversion between Java and Python - ASF JIRA

Hi,
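I found strange behaviour in Spark when handling a datetime from the night of the daylight saving time change (in CET), when clocks are set back and the hour between 02:00 and 03:00 occurs twice.

The value read from MySQL is converted incorrectly, and as a result fold=1 is set on the returned datetime.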

Sample code

The MySQL column has type DATETIME and holds the value "2017-10-29 02:01:44".

spark.read.jdbc(URL, "table").select("time_column").collect()
[Row(start_time=datetime.datetime(2017, 10, 29, 2, 1, 44, fold=1))]

For comparison, the same query run through SQLAlchemy:

from sqlalchemy import create_engine

engine = create_engine(URL)
engine.execute("select time_column from table").fetchone()
(datetime.datetime(2017, 10, 29, 2, 1, 44),)

I'm using Python 3.6. Both the MySQL server and the machine where I run the queries are in the CET timezone.
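
To make the difference between the two results concrete, here is a minimal sketch using only the standard library. It does not reproduce the Spark conversion itself; it only rebuilds the two values shown above and compares them. The helper name drop_fold is hypothetical and is offered as a possible workaround, not as an established fix.

```python
from datetime import datetime

# Value as reported from Spark: fold=1 marks the second occurrence of an
# ambiguous wall-clock time (see PEP 495).
spark_value = datetime(2017, 10, 29, 2, 1, 44, fold=1)

# Value as reported from SQLAlchemy: fold defaults to 0.
sqlalchemy_value = datetime(2017, 10, 29, 2, 1, 44)

# Naive datetimes ignore fold when compared or hashed...
same_wall_clock = spark_value == sqlalchemy_value
same_hash = hash(spark_value) == hash(sqlalchemy_value)

# ...but the attribute itself (and therefore the repr) still differs.
folds = (spark_value.fold, sqlalchemy_value.fold)


def drop_fold(value):
    """Hypothetical helper: return the same wall-clock time with fold reset to 0."""
    return value.replace(fold=0)


normalized = drop_fold(spark_value)
print(same_wall_clock, same_hash, folds, repr(normalized))
```

So the two results compare equal as naive datetimes, but they would resolve to different instants if a timezone with a DST transition at that moment were attached later.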