Spark / SPARK-25652

Wrong datetime conversion between Java and Python

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.3.0
    • Fix Version/s: None
    • Component/s: PySpark

    Description

      Hi,
      I found strange behaviour in Spark when reading a datetime from the night of the daylight saving time change (in CET).

      The value read from MySQL is converted incorrectly: the resulting Python datetime has fold=1 set.

      Sample code

      The MySQL column has type DATETIME and holds the value "2017-10-29 02:01:44".

      spark.read.jdbc(URL).select("time_column").collect()
      [Row(start_time=datetime.datetime(2017, 10, 29, 2, 1, 44, fold=1))]
      

      For comparison, the same query run through SQLAlchemy:

      engine = create_engine(URL)
      engine.execute("select time_column from table").fetchone()
      (datetime.datetime(2017, 10, 29, 2, 1, 44),)
      

      I'm using Python 3.6. Both the MySQL server and the machine running the queries are in the CET timezone.
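
      For context, here is a minimal sketch of how a naive Python datetime can end up with fold=1. It assumes, without confirmation from this report, that the value passes through an epoch-to-local-time conversion such as datetime.fromtimestamp somewhere on the Spark side. The POSIX TZ string and the variable names are illustrative choices, not taken from Spark.

```python
import os
import time
from datetime import datetime, timezone

# Assumption: a Unix-like system, since time.tzset() is not available on Windows.
# The POSIX TZ string describes the CET/CEST rules without needing tz database files.
os.environ["TZ"] = "CET-1CEST,M3.5.0,M10.5.0/3"
time.tzset()

# On 2017-10-29 clocks went back from 03:00 CEST to 02:00 CET,
# so the local wall time 02:01:44 occurred twice that night.
first_epoch = datetime(2017, 10, 29, 0, 1, 44, tzinfo=timezone.utc).timestamp()
second_epoch = datetime(2017, 10, 29, 1, 1, 44, tzinfo=timezone.utc).timestamp()

# Converting each instant to a naive local datetime gives the same wall time;
# only the fold attribute tells the two occurrences apart.
first = datetime.fromtimestamp(first_epoch)    # 02:01:44 CEST, first occurrence
second = datetime.fromtimestamp(second_epoch)  # 02:01:44 CET, second occurrence

print(repr(first), first.fold)
print(repr(second), second.fold)
print(first == second)
```

      Under these assumptions the first value has fold=0 and the second has fold=1, and the two still compare equal, because comparison of naive datetimes ignores fold. If that is what happens here, the string "2017-10-29 02:01:44" was interpreted as the second (CET) occurrence of that wall time.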

          People

            Assignee: Unassigned
            Reporter: Maciej Bryński (maver1ck)
            Votes: 0
            Watchers: 2