Spark / SPARK-27450

Timestamp cast fails when the ISO8601 string omits minutes, seconds or milliseconds


Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.3.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None
    • Environment: Spark 2.3.x

    Description

      ISO 8601 allows minutes, seconds, and milliseconds to be omitted from the time part:

      hh:mm:ss.sss or hhmmss.sss
      hh:mm:ss or hhmmss
      hh:mm or hhmm
      hh

      Either the seconds, or the minutes and seconds, may be omitted from the basic or extended time formats for greater brevity but decreased accuracy: [hh]:[mm], [hh][mm] and [hh] are the resulting reduced accuracy time formats

      Source: Wikipedia ISO8601

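      Popular libraries respect that: for example, java.time's ZonedDateTime parses strings that omit the seconds. However, casting such a string to TimestampType fails silently when it carries a zone designator: the result is null and no error is raised.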
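The java.time behavior referred to above can be checked without Spark. The following is a minimal sketch, not part of the original report; the object name `ReducedIsoParsing` and the helper `parseZoned` are hypothetical names chosen for illustration. It relies on `ZonedDateTime.parse`, whose default ISO formatter treats the seconds field as optional.

```scala
import java.time.ZonedDateTime
import scala.util.Try

object ReducedIsoParsing {
  // Hypothetical helper: returns Some(parsed value) when java.time accepts
  // the string with its default ISO formatter, None when it rejects it.
  def parseZoned(s: String): Option[ZonedDateTime] =
    Try(ZonedDateTime.parse(s)).toOption

  def main(args: Array[String]): Unit = {
    // The two failing inputs from this report, plus a fully specified one.
    val samples = Seq(
      "2017-08-01T02:33Z",
      "2017-08-01T02:33-03:00",
      "2017-08-01T02:33:00Z"
    )
    samples.foreach { s =>
      println(s"$s parsed: ${parseZoned(s).isDefined}")
    }
  }
}
```

Note that the default java.time formatter still requires the minutes, so an hour-only string such as `2017-08-01T02Z` is rejected there as well; only the omitted-seconds case from this report is covered.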

       

      import org.apache.spark.sql.functions.col
      import org.apache.spark.sql.types._
      import spark.implicits._ // for toDF; already in scope in spark-shell

      // Case 1: seconds omitted, no TZ offset (local time) [OK]
      val df1 = Seq("2017-08-01T02:33").toDF("eventTimeString")
      val new_df1 = df1
        .withColumn("eventTimeTS", col("eventTimeString").cast(TimestampType))

      new_df1.show(false)

      +----------------+-------------------+
      |eventTimeString |eventTimeTS        |
      +----------------+-------------------+
      |2017-08-01T02:33|2017-08-01 02:33:00|
      +----------------+-------------------+
      
      // Case 2: seconds omitted, UTC designator "Z" (ISO 8601) [FAIL]
      val df2 = Seq("2017-08-01T02:33Z").toDF("eventTimeString")
      val new_df2 = df2
        .withColumn("eventTimeTS", col("eventTimeString").cast(TimestampType))

      new_df2.show(false)

      +-----------------+-----------+
      |eventTimeString  |eventTimeTS|
      +-----------------+-----------+
      |2017-08-01T02:33Z|null       |
      +-----------------+-----------+
      

       

      // Case 3: seconds omitted, numeric offset (ISO 8601) [FAIL]
      val df3 = Seq("2017-08-01T02:33-03:00").toDF("eventTimeString")
      val new_df3 = df3
        .withColumn("eventTimeTS", col("eventTimeString").cast(TimestampType))

      new_df3.show(false)

      +----------------------+-----------+
      |eventTimeString       |eventTimeTS|
      +----------------------+-----------+
      |2017-08-01T02:33-03:00|null       |
      +----------------------+-----------+
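Until the cast handles these inputs, one possible workaround is to pad the missing seconds before casting. The following is a rough sketch only, not a fix proposed in this report; `PadIsoSeconds` and `padSeconds` are hypothetical names. It assumes the cast accepts the fully specified form (for example `2017-08-01T02:33:00Z`), which this report does not test.

```scala
object PadIsoSeconds {
  // Matches "<yyyy-MM-dd>T<hh>:<mm>" followed by an optional zone designator
  // ("Z", "+hh:mm" or "-hh:mm") and nothing else, i.e. a time without seconds.
  private val MissingSeconds =
    """^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2})(Z|[+-]\d{2}:\d{2})?$""".r

  // Hypothetical helper: inserts ":00" after the minutes when the seconds are
  // missing; any other input is returned unchanged.
  def padSeconds(s: String): String = s match {
    // The zone group is null when the input has no zone designator.
    case MissingSeconds(dateTime, zone) => dateTime + ":00" + Option(zone).getOrElse("")
    case other => other
  }
}
```

In a Spark job this function could be wrapped with `org.apache.spark.sql.functions.udf` and applied to `eventTimeString` before the `cast(TimestampType)` call; that usage has not been verified against Spark 2.3.0 here.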
      

       

       


      People

      Assignee: Unassigned
      Reporter: Leandro Rosa (leandro.rosa)
      Votes: 0
      Watchers: 3
