Spark / SPARK-33632

to_date doesn't behave as documented

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 3.0.1
    • Fix Version/s: None
    • Component/s: Documentation, Spark Core
    • Labels: None

      Description

      I'm trying to use to_date on a string formatted as "10/31/20".
      Expected output is "2020-10-31".
      Actual output is "0020-01-31".

      The documentation gives both 2020 and 20 as example values for the pattern letter "y".

      The example below reproduces the problem. The "udf" column shows the expected behaviour.

      import java.sql.Date
      
      import org.apache.spark.sql.SparkSession
      import org.apache.spark.sql.functions.{to_date, udf}
      
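      object ToDate {
        // Hand-written parser that produces the expected result:
        // month/day/two-digit year, with the year offset by 2000.
        val toDate = udf((date: String) => {
          val split = date.split("/")
          val month = "%02d".format(split(0).toInt)
          val day = "%02d".format(split(1).toInt)
          val year = split(2).toInt + 2000
      
          // Date.valueOf expects the form yyyy-[m]m-[d]d.
          Date.valueOf(s"${year}-${month}-${day}")
        })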
      
        def main(args: Array[String]): Unit = {
          val spark = SparkSession.builder().master("local[2]").getOrCreate()
          spark.sparkContext.setLogLevel("ERROR")
          import spark.implicits._
      
          Seq("1/1/20", "10/31/20")
            .toDF("raw")
            .withColumn("to_date", to_date($"raw", "m/d/y"))
            .withColumn("udf", toDate($"raw"))
            .show
        }
      }
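
The "Not A Problem" resolution is consistent with the pattern letters being the cause rather than to_date itself. In Spark 3.x datetime patterns, lowercase "m" is minute-of-hour (uppercase "M" is month), and a single "y" reads the digits as the literal year, while "yy" is the two-digit form parsed relative to 2000. The sketch below is an assumption about the intended pattern, not a fix taken from the ticket; the object name, the helper withParsedDates, and the column names are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.to_date

object ToDatePatternSketch {
  // Hypothetical helper: parses the same strings with the reported pattern
  // and with an adjusted pattern, side by side.
  def withParsedDates(spark: SparkSession, raw: Seq[String]): DataFrame = {
    import spark.implicits._

    raw.toDF("raw")
      // Pattern from the report: "m" is minute-of-hour, so no month is parsed,
      // and a single "y" takes "20" as the year 20.
      .withColumn("as_reported", to_date($"raw", "m/d/y"))
      // Assumed intended pattern: "M" is month-of-year, "yy" is a two-digit
      // year interpreted in the range 2000 to 2099.
      .withColumn("adjusted", to_date($"raw", "M/d/yy"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[2]").getOrCreate()
    spark.sparkContext.setLogLevel("ERROR")

    withParsedDates(spark, Seq("1/1/20", "10/31/20")).show()

    spark.stop()
  }
}
```

With the adjusted pattern, the "adjusted" column should match the "udf" column from the original example, so the hand-written udf would no longer be needed.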
      
      

        Attachments

        1. image-2020-12-04-11-45-10-379.png (65 kB, Liu Neng)
        2. screenshot-1.png (5 kB, Frank Oosterhuis)

            People

            • Assignee: Unassigned
            • Reporter: Frank Oosterhuis (frankivo)