SPARK-19342

Datatype timestamp is converted to numeric in collect method

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.1.1, 2.2.0
    • Component/s: SparkR
    • Labels: None

    Description

      The collect method returns a timestamp column as double (numeric) instead of POSIXct when the first value in the column is NA.

      The following code and output show how to reproduce the bug:

      > sparkR.session(master = "local")
      Spark package found in SPARK_HOME: /home/titicaca/spark-2.1
      Launching java with spark-submit command /home/titicaca/spark-2.1/bin/spark-submit   sparkr-shell /tmp/RtmpqmpZUg/backend_port363a898be92 
      Java ref type org.apache.spark.sql.SparkSession id 1 
      > df <- data.frame(col1 = c(0, 1, 2), 
      +                  col2 = c(as.POSIXct("2017-01-01 00:00:01"), NA, as.POSIXct("2017-01-01 12:00:01")))
      > sdf1 <- createDataFrame(df)
      > print(dtypes(sdf1))
      [[1]]
      [1] "col1"   "double"
      
      [[2]]
      [1] "col2"      "timestamp"
      
      > df1 <- collect(sdf1)
      > print(lapply(df1, class))
      $col1
      [1] "numeric"
      
      $col2
      [1] "POSIXct" "POSIXt" 
      
      > sdf2 <- filter(sdf1, "col1 > 0")
      > print(dtypes(sdf2))
      [[1]]
      [1] "col1"   "double"
      
      [[2]]
      [1] "col2"      "timestamp"
      
      > df2 <- collect(sdf2)
      > print(lapply(df2, class))
      $col1
      [1] "numeric"
      
      $col2
      [1] "numeric"
      

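      As we can see, the data type of col2 is unexpectedly converted to numeric in the collected local data frame df2. After the filter, the row with NA in col2 is the first row, whereas in df1 the first value of col2 is a valid timestamp and the POSIXct class is preserved.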
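
The report does not state the root cause. One plausible explanation, offered here only as an assumption, is that the collected column values are combined with `c()`, which dispatches on its first argument: a leading timestamp selects the POSIXct method, while a leading NA (a plain logical) selects the default method, which drops the class. The sketch below reproduces that effect in base R without Spark and shows a possible workaround for an affected column; the variable names are hypothetical.

```r
# Base R only; no Spark session is needed for this illustration.
ts <- as.POSIXct("2017-01-01 12:00:01", tz = "UTC")

# Timestamp first: c() dispatches to the POSIXct method, so the class is kept.
ts_first <- do.call(c, list(ts, NA))
print(class(ts_first))

# NA first: c() dispatches on the logical NA, so the default method runs
# and the POSIXct class attribute is dropped.
na_first <- do.call(c, list(NA, ts))
print(class(na_first))

# Possible workaround (assumption: the doubles are seconds since the Unix
# epoch): convert the affected column back to POSIXct after collecting.
restored <- as.POSIXct(na_first, origin = "1970-01-01", tz = "UTC")
print(class(restored))
```

For a collected data frame such as df2, the same conversion could be applied to the column, for example `df2$col2 <- as.POSIXct(df2$col2, origin = "1970-01-01")`. The issue is marked as fixed in 2.1.1 and 2.2.0, so this should only be needed on the affected version.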

          People

            Titicaca Fangzhou Yang
            Titicaca Fangzhou Yang
            Felix Cheung