Spark / SPARK-17545

Spark SQL Catalyst doesn't handle ISO 8601 date without colon in offset

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.1, 2.1.0
    • Component/s: SQL
    • Labels: None

    Description

      When parsing a CSV file with a date/time column in an ISO 8601 variant whose UTC offset omits the colon (for example, -0500 rather than -05:00), casting to Timestamp fails.

      Here's a simple example of such CSV content.

      time
      "2015-07-20T15:09:23.736-0500"
      "2015-07-20T15:10:51.687-0500"
      "2015-11-21T23:15:01.499-0600"

      Here's the stack trace that results from processing this data.

      16/09/14 15:22:59 ERROR Utils: Aborting task
      java.lang.IllegalArgumentException: 2015-11-21T23:15:01.499-0600
      at org.apache.xerces.jaxp.datatype.XMLGregorianCalendarImpl$Parser.skip(Unknown Source)
      at org.apache.xerces.jaxp.datatype.XMLGregorianCalendarImpl$Parser.parse(Unknown Source)
      at org.apache.xerces.jaxp.datatype.XMLGregorianCalendarImpl.<init>(Unknown Source)
      at org.apache.xerces.jaxp.datatype.DatatypeFactoryImpl.newXMLGregorianCalendar(Unknown Source)
      at javax.xml.bind.DatatypeConverterImpl._parseDateTime(DatatypeConverterImpl.java:422)
      at javax.xml.bind.DatatypeConverterImpl.parseDateTime(DatatypeConverterImpl.java:417)
      at javax.xml.bind.DatatypeConverter.parseDateTime(DatatypeConverter.java:327)
      at org.apache.spark.sql.catalyst.util.DateTimeUtils$.stringToTime(DateTimeUtils.scala:140)
      at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$.castTo(CSVInferSchema.scala:287)
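
The trace shows the value being rejected inside javax.xml.bind.DatatypeConverter.parseDateTime, which follows the XML Schema dateTime form and so expects the offset written with a colon. Until the parser accepts the colon-less form, one possible workaround is to rewrite the offset before Spark reads the data. The following is a minimal sketch of that idea, not part of Spark; the helper name normalize_offset and the regular expression are assumptions made for illustration, and they only target timestamps shaped like the sample above.

```python
import re

# Hypothetical helper, not a Spark API: matches a time-of-day followed by a
# trailing offset written as +HHMM or -HHMM (no colon).
_OFFSET_WITHOUT_COLON = re.compile(
    r"(T\d{2}:\d{2}:\d{2}(?:\.\d+)?)([+-]\d{2})(\d{2})$"
)


def normalize_offset(timestamp: str) -> str:
    """Rewrite a trailing +HHMM/-HHMM offset as +HH:MM/-HH:MM.

    Values that already carry a colon, or carry no offset, are returned unchanged.
    """
    return _OFFSET_WITHOUT_COLON.sub(r"\1\2:\3", timestamp)


# The three values from the sample CSV in this report.
rows = [
    "2015-07-20T15:09:23.736-0500",
    "2015-07-20T15:10:51.687-0500",
    "2015-11-21T23:15:01.499-0600",
]
normalized = [normalize_offset(row) for row in rows]
for value in normalized:
    print(value)
```

Applying this to the column as a preprocessing step leaves the instant unchanged and only alters the textual form of the offset.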

      On a related note, I believe the Python standard library can produce this form of zone offset (the %z strftime directive emits the offset without a colon). The system I got the data from is written in Python.
      https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior
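
To illustrate how such values can arise, here is a small sketch that formats a timezone-aware datetime with %z. It assumes Python 3 (the datetime.timezone class is not available in Python 2, which the link above documents), and the fixed -05:00 offset is chosen only to mirror the first sample row; it is not taken from the system that produced the data.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offset of -05:00, matching the first row of the sample CSV.
offset = timezone(timedelta(hours=-5))
ts = datetime(2015, 7, 20, 15, 9, 23, 736000, tzinfo=offset)

# %f yields microseconds (6 digits), so trim to milliseconds before
# appending the %z offset, which is written as -0500 (no colon).
formatted = ts.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + ts.strftime("%z")
print(formatted)

# isoformat() writes the same instant with a colon in the offset.
iso = ts.isoformat(timespec="milliseconds")
print(iso)
```

The first value has the colon-less offset seen in this report, while the isoformat() output uses the -05:00 form.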

          People

            Assignee: gurwls223 Hyukjin Kwon
            Reporter: nbeyer Nathan Beyer