Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Version: 1.6.0
Description
Consider the following JSON:
val rdd = sc.parallelize("""{"ts":1452386229}""" :: Nil)
Spark SQL casts an int to a timestamp by treating the int value as a number of seconds (see https://issues.apache.org/jira/browse/SPARK-11724):
scala> sqlContext.read.json(rdd).select($"ts".cast(TimestampType)).show
+--------------------+
|                  ts|
+--------------------+
|2016-01-10 01:37:...|
+--------------------+
However, parsing the same JSON with an explicit schema gives a different result:
scala> val schema = (new StructType).add("ts", TimestampType)
schema: org.apache.spark.sql.types.StructType = StructType(StructField(ts,TimestampType,true))

scala> sqlContext.read.schema(schema).json(rdd).show
+--------------------+
|                  ts|
+--------------------+
|1970-01-17 20:26:...|
+--------------------+
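
The two results appear consistent with the same integer being read in different units: seconds in the cast, milliseconds when the schema is supplied. The sketch below checks that arithmetic with plain java.time, without Spark. The object name TimestampUnits is made up for illustration, and the milliseconds reading is an inference from the output shown here, not something confirmed in this report. Instants print in UTC, so the one-hour offset from the displayed values would correspond to a session time zone of UTC+1, which is also an assumption.

```scala
import java.time.Instant

// Hypothetical helper, not part of Spark: compares two readings of the same integer.
object TimestampUnits {
  // The raw value from the JSON document in this report.
  val raw: Long = 1452386229L

  // Read as seconds since the epoch, which is how the cast behaves per the report.
  val asSeconds: Instant = Instant.ofEpochSecond(raw)

  // Read as milliseconds since the epoch (assumed explanation for the schema result).
  val asMillis: Instant = Instant.ofEpochMilli(raw)

  def main(args: Array[String]): Unit = {
    println(asSeconds) // 2016-01-10T00:37:09Z
    println(asMillis)  // 1970-01-17T19:26:26.229Z
  }
}
```

Both printed dates match the dates in the two REPL outputs above, which supports the unit mismatch as the likely cause of the discrepancy.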