Description
Write dates/timestamps to Avro file in Spark 2.4.5:
$ export TZ="America/Los_Angeles"
$ bin/spark-shell --packages org.apache.spark:spark-avro_2.11:2.4.5
scala> df.write.format("avro").save("/Users/maxim/tmp/before_1582/2_4_5_date_avro")
scala> spark.read.format("avro").load("/Users/maxim/tmp/before_1582/2_4_5_date_avro").show(false)
+----------+
|date      |
+----------+
|1001-01-01|
+----------+

scala> df2.write.format("avro").save("/Users/maxim/tmp/before_1582/2_4_5_ts_avro")
scala> spark.read.format("avro").load("/Users/maxim/tmp/before_1582/2_4_5_ts_avro").show(false)
+--------------------------+
|ts                        |
+--------------------------+
|1001-01-01 01:02:03.123456|
+--------------------------+
Spark 3.0.0-preview2 (and 3.1.0-SNAPSHOT) returns different values when reading the files written by Spark 2.4.5:
$ export TZ="America/Los_Angeles"
$ bin/spark-shell --packages org.apache.spark:spark-avro_2.12:2.4.5
scala> spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")
scala> spark.read.format("avro").load("/Users/maxim/tmp/before_1582/2_4_5_date_avro").show(false)
+----------+
|date      |
+----------+
|1001-01-07|
+----------+

scala> spark.read.format("avro").load("/Users/maxim/tmp/before_1582/2_4_5_ts_avro").show(false)
+--------------------------+
|ts                        |
+--------------------------+
|1001-01-07 01:09:05.123456|
+--------------------------+
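The 6-day shift in the date column is consistent with a calendar mismatch: Spark 2.4 computes day counts with the hybrid Julian/Gregorian calendar, while Spark 3.0 interprets them in the Proleptic Gregorian calendar. The following is a minimal sketch that reproduces the date shift with plain JDK classes only. It does not touch Spark or Avro, the names hybrid, hybridDays and proleptic are illustrative, and it does not attempt to reproduce the additional time-of-day difference seen in the timestamp column.

```scala
import java.time.LocalDate
import java.util.{Calendar, GregorianCalendar, TimeZone}

// java.util.GregorianCalendar is a hybrid calendar: dates before the
// 1582-10-15 cutover are treated as Julian dates.
val hybrid = new GregorianCalendar(TimeZone.getTimeZone("UTC"))
hybrid.clear()
hybrid.set(1001, Calendar.JANUARY, 1)

// Days since 1970-01-01 for 1001-01-01 as the hybrid calendar sees it.
val hybridDays = Math.floorDiv(hybrid.getTimeInMillis, 86400000L)

// java.time.LocalDate uses the Proleptic Gregorian calendar, so the same
// day count maps to a different calendar date.
val proleptic = LocalDate.ofEpochDay(hybridDays)

println(proleptic) // 1001-01-07
```

The same stored day count therefore displays as 1001-01-01 under one calendar and 1001-01-07 under the other, which matches the outputs above.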
Issue Links
- is related to SPARK-31318 Split Parquet/Avro configs for rebasing dates/timestamps in read and in write (Resolved)