Spark / SPARK-16981

For CSV files nullValue is not respected for Date/Time data type

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Cannot Reproduce
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      Test case:
      import org.apache.spark.sql.types._

      val struct = StructType(Seq(StructField("col1", StringType, true), StructField("col2", TimestampType, true), StructField("col3", StringType, true)))

      val cq = sqlContext.readStream
      .format("csv")
      .option("nullValue", " ")
      .schema(struct)
      .load(s"somepath")
      .writeStream(....)

      Content of the file:
      "abc", ,"def"

      Result:
      Exception is thrown:
      scala.MatchError: java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff] (of class java.lang.IllegalArgumentException)

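      Code analysis:
      The problem is caused by the castTo method of the CSVTypeCast object.
      For every data type except the temporal ones (date and timestamp), it performs the following check:
      if (datum == options.nullValue && nullable) {
        null
      }
      For temporal types this check is missing, so the nullValue string is passed on to the timestamp parser, which throws the exception above.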
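
The missing check could look roughly like the sketch below. It is a standalone illustration, not Spark's actual code: `CsvOptionsSketch`, `TemporalCastSketch` and `castToTimestamp` are hypothetical names, and parsing is delegated to `java.sql.Timestamp.valueOf`, which is assumed here because it produces the same "Timestamp format must be ..." message seen in the report.

```scala
import java.sql.Timestamp

// Hypothetical, simplified stand-in for the CSV reader options (not Spark's CSVOptions).
final case class CsvOptionsSketch(nullValue: String)

object TemporalCastSketch {
  // Returns null when the raw field equals the configured nullValue and the
  // column is nullable; otherwise parses the field as a java.sql.Timestamp.
  def castToTimestamp(datum: String, nullable: Boolean, options: CsvOptionsSketch): Timestamp =
    if (datum == options.nullValue && nullable) {
      null
    } else {
      // Timestamp.valueOf throws IllegalArgumentException on malformed input.
      Timestamp.valueOf(datum)
    }
}
```

With `nullValue` set to " ", the empty second column of the sample file would map to null instead of reaching the parser, while a non-nullable column would still fail on the same input.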


            People

              Assignee: Unassigned
              Reporter: Lev (lev.numerify)
