  1. Spark
  2. SPARK-20166

Use XXX for ISO timezone instead of ZZ which is FastDateFormat specific in CSV/JSON time related options

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0
    • Component/s: SQL
    • Labels: None

    Description

      We can use the XXX format instead of ZZ. ZZ appears to be specific to FastDateFormat. Please see https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html#iso8601timezone and https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/time/FastDateFormat.html

      ZZ supports "ISO 8601 extended format time zones", but it appears to be a FastDateFormat-specific option.

      It seems better to replace ZZ with XXX because both appear to use the same strategy: https://github.com/apache/commons-lang/blob/8767cd4f1a6af07093c1e6c422dae8e574be7e5e/src/main/java/org/apache/commons/lang3/time/FastDateParser.java#L930.

      I also checked the code and debugged it manually to be sure. Both cases appear to use the same pattern:

      ( Z|(?:[+-]\\d{2}(?::)\\d{2}))
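      As a rough illustration of what that pattern accepts, the sketch below compiles it with `java.util.regex` and checks a few offsets. It assumes the leading space in the quoted pattern is not part of the regex, and the object and method names (`IsoOffsetCheck`, `isIsoOffset`) are hypothetical, not taken from commons-lang.

```scala
import java.util.regex.Pattern

object IsoOffsetCheck {
  // Assumption: the pattern quoted in this issue, minus the leading space.
  // Written as a Scala string literal, so "\\d" is the regex token \d.
  private val isoOffset: Pattern =
    Pattern.compile("(Z|(?:[+-]\\d{2}(?::)\\d{2}))")

  // True only when the whole string is "Z" or a colon-separated offset such as -11:00.
  def isIsoOffset(s: String): Boolean = isoOffset.matcher(s).matches()
}

// ISO 8601 extended offsets match; the RFC 822 form without a colon does not.
Seq("Z", "-11:00", "+09:00", "-1100").foreach { s =>
  println(s"$s -> ${IsoOffsetCheck.isIsoOffset(s)}")
}
```

      Under that assumption, `Z`, `-11:00` and `+09:00` are accepted, while `-1100` is rejected because the colon is mandatory.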

      Note that this is a documentation fix, not a behaviour change: ZZ seems to be an invalid format for ISO 8601 time zones in SimpleDateFormat, which is the class the DataFrameReader documentation refers to:

       * <li>`timestampFormat` (default `yyyy-MM-dd'T'HH:mm:ss.SSSZZ`): sets the string that
       * indicates a timestamp format. Custom date formats follow the formats at
       * `java.text.SimpleDateFormat`. This applies to timestamp type.</li>

      scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00")
      res4: java.util.Date = Tue Mar 21 20:00:00 KST 2017
      
      scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z")
      res10: java.util.Date = Tue Mar 21 09:00:00 KST 2017
      
      scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00")
      java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000-11:00"
        at java.text.DateFormat.parse(DateFormat.java:366)
        ... 48 elided

      scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z")
      java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000Z"
        at java.text.DateFormat.parse(DateFormat.java:366)
        ... 48 elided
      
      scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00")
      res7: java.util.Date = Tue Mar 21 20:00:00 KST 2017
      
      scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z")
      res1: java.util.Date = Tue Mar 21 09:00:00 KST 2017
      
      scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00")
      res8: java.util.Date = Tue Mar 21 20:00:00 KST 2017
      
      scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z")
      res2: java.util.Date = Tue Mar 21 09:00:00 KST 2017
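
      The transcript above covers parsing only. As a complementary sketch, the following formats a fixed instant with both patterns using `java.text.SimpleDateFormat`, to show what each pattern writes. The object and method names (`TimeZonePatternDemo`, `render`) and the chosen instant and zones are illustrative assumptions, not part of Spark.

```scala
import java.text.SimpleDateFormat
import java.util.{Date, Locale, TimeZone}

object TimeZonePatternDemo {
  // Fixed instant (the Unix epoch) so the output does not depend on the current time.
  private val epoch = new Date(0L)

  // Formats the epoch with the given pattern in the given time zone.
  // Locale.US keeps the digits ASCII regardless of the JVM default locale.
  def render(pattern: String, zoneId: String): String = {
    val fmt = new SimpleDateFormat(pattern, Locale.US)
    fmt.setTimeZone(TimeZone.getTimeZone(zoneId))
    fmt.format(epoch)
  }
}

// XXX writes ISO 8601 extended offsets, with "Z" for a zero offset.
println(TimeZonePatternDemo.render("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", "UTC"))
println(TimeZonePatternDemo.render("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", "GMT-11:00"))
// In SimpleDateFormat, Z (and ZZ) writes an RFC 822 offset without a colon.
println(TimeZonePatternDemo.render("yyyy-MM-dd'T'HH:mm:ss.SSSZZ", "GMT-11:00"))
```

      With plain SimpleDateFormat, the ZZ output has no colon in the offset, which is consistent with the parse failures shown in the transcript.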
      

          People

            Assignee: gurwls223 Hyukjin Kwon
            Reporter: gurwls223 Hyukjin Kwon