Spark / SPARK-45433

CSV/JSON schema inference reports an error when timestamps do not match the specified timestampFormat and there is only one row on each partition

Details

    Description

      CSV/JSON schema inference reports an error when timestamps do not match the specified timestampFormat and there is `only one row on each partition`.

      // Example (e.g. in spark-shell, where `spark` is the active SparkSession)
      import spark.implicits._  // needed for .toDS()
      val csv = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss")
        .option("inferSchema", true).csv(Seq("2884-06-24T02:45:51.138").toDS())
      csv.show()
      // Error
      Caused by: java.time.format.DateTimeParseException: Text '2884-06-24T02:45:51.138' could not be parsed, unparsed text found at index 19
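
The error message itself can be reproduced without Spark. The following is a minimal sketch, assuming a plain `java.time.format.DateTimeFormatter` built from the same pattern; Spark's internal timestamp formatter is configured differently, so this illustrates only why the text fails to parse, not where the inference code goes wrong. The object name `TimestampFormatMismatch` and the helper `matchesFormat` are hypothetical and not part of any Spark API.

```scala
import java.time.LocalDateTime
import java.time.format.{DateTimeFormatter, DateTimeParseException}
import scala.util.{Failure, Try}

object TimestampFormatMismatch {
  // Same pattern and input as in the report above.
  val formatter: DateTimeFormatter =
    DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss")
  val input = "2884-06-24T02:45:51.138"

  // Hypothetical helper (not a Spark API): true if `text` fully matches the pattern.
  def matchesFormat(text: String): Boolean =
    Try(LocalDateTime.parse(text, formatter)).isSuccess

  def main(args: Array[String]): Unit = {
    // The pattern consumes the first 19 characters; the trailing ".138" is left over.
    Try(LocalDateTime.parse(input, formatter)) match {
      case Failure(e: DateTimeParseException) =>
        println(s"error index: ${e.getErrorIndex}; ${e.getMessage}")
      case other =>
        println(s"unexpected result: $other")
    }
    println(matchesFormat("2884-06-24T02:45:51")) // no fractional seconds
    println(matchesFormat(input))                 // with ".138"
  }
}
```

A check like `matchesFormat` shows the behavior one would presumably expect from schema inference: a value that does not match the configured timestampFormat is simply not treated as a timestamp, rather than surfacing the parse exception.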

      This bug affects 3.3/3.4/3.5. It is a different bug from https://issues.apache.org/jira/browse/SPARK-45424, although it produces the same error message.

          People

            fanjia Jia Fan