SPARK-17039: cannot read null dates from csv file

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: Input/Output
    • Labels: None

    Description

      I see the exact same bug as reported in this Stack Overflow post, using Spark 2.0.0 (released version).
      In Scala, I read a CSV file using

      sqlContext.read
        .format("csv")
        .option("header", "false")
        .option("inferSchema", "false")
        .option("nullValue", "?")
        .option("dateFormat", "yyyy-MM-dd'T'HH:mm:ss")
        .schema(dfSchema)
        .csv(dataFile)

      The data contains some null dates (represented with "?").
      The error I get is:

      org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 8.0 failed 1 times, most recent failure: Lost task 0.0 in stage 8.0 (TID 10, localhost): java.text.ParseException: Unparseable date: "?"
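
      The ParseException suggests the "?" marker reaches the date parser before it is treated as null. The sketch below reproduces that with plain java.text.SimpleDateFormat, outside Spark, and shows a guard that checks the marker first. The object NullableDateParsing and the helper parseNullable are hypothetical names used only for illustration; they are not Spark APIs and do not describe Spark's internal code.

```scala
import java.text.SimpleDateFormat
import java.util.Date
import scala.util.Try

object NullableDateParsing {
  // Same pattern as the dateFormat option in the report.
  val pattern = "yyyy-MM-dd'T'HH:mm:ss"

  // Hypothetical helper (not a Spark API): treat the null marker as a
  // missing value before handing the string to the date parser.
  def parseNullable(raw: String, nullValue: String = "?"): Option[Date] =
    if (raw == null || raw == nullValue) None
    else Some(new SimpleDateFormat(pattern).parse(raw))

  def main(args: Array[String]): Unit = {
    // Parsing the marker directly fails, as the Spark task does.
    val direct = Try(new SimpleDateFormat(pattern).parse("?"))
    println(s"direct parse failed: ${direct.isFailure}")

    // Checking the marker first yields a missing value instead of an error.
    println(s"guarded parse of marker: ${parseNullable("?")}")
    println(s"guarded parse of a date is defined: ${parseNullable("2016-08-12T10:15:30").isDefined}")
  }
}
```

      With the guard in place the marker maps to None and never reaches SimpleDateFormat.parse, while well-formed values are parsed as before.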
      

    People

    • Assignee: Unassigned
    • Reporter: Barry Becker (barrybecker4)