Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
1.2.0
None
Description
A date field can be declared with the logical 'date' type in an Avro schema as shown below. The value is physically stored as an int but is logically annotated as a date. The stored value represents the number of days since the Unix epoch (1970-01-01):
{"name":"date","type":{"type":"int","logicalType":"date"}}
When AvroReader reads the logical date value, it converts the int back to java.sql.Date in the AvroTypeUtil.normalizeValue method. However, if the field is also nullable (a union with null), the conversion is skipped and the original int is returned.
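The behaviour described above can be illustrated with a minimal sketch. It is not the NiFi or Avro code: FieldSchema, NullableDateSketch, unwrapNullable and normalize are hypothetical names invented for illustration, and java.time.LocalDate stands in for java.sql.Date. It only shows the idea of unwrapping a nullable union before checking for the logical 'date' type, so that both the plain and the nullable field are converted.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for an Avro schema (NOT the Avro or NiFi API).
final class FieldSchema {
    final String type;                  // e.g. "int", "null", "union"
    final String logicalType;           // e.g. "date", or null when absent
    final List<FieldSchema> unionTypes; // member schemas when type is "union"

    private FieldSchema(String type, String logicalType, List<FieldSchema> unionTypes) {
        this.type = type;
        this.logicalType = logicalType;
        this.unionTypes = unionTypes;
    }

    static FieldSchema primitive(String type, String logicalType) {
        return new FieldSchema(type, logicalType, Collections.<FieldSchema>emptyList());
    }

    static FieldSchema union(FieldSchema... members) {
        return new FieldSchema("union", null, Arrays.asList(members));
    }
}

final class NullableDateSketch {
    // For a union, return the first non-null member; otherwise return the schema itself.
    static FieldSchema unwrapNullable(FieldSchema schema) {
        if (!"union".equals(schema.type)) {
            return schema;
        }
        for (FieldSchema member : schema.unionTypes) {
            if (!"null".equals(member.type)) {
                return member;
            }
        }
        return schema;
    }

    // Convert an int annotated as a logical date into a date, even when the field is nullable.
    static Object normalize(Object value, FieldSchema schema) {
        if (value == null) {
            return null;
        }
        FieldSchema effective = unwrapNullable(schema);
        boolean isLogicalDate = "int".equals(effective.type) && "date".equals(effective.logicalType);
        if (isLogicalDate && value instanceof Integer) {
            // The int holds days since the Unix epoch.
            return LocalDate.ofEpochDay(((Integer) value).longValue());
        }
        return value; // no logical type applies: return the raw value unchanged
    }
}
```

Under these assumptions, normalize(17296, schema) yields 2017-05-10 for both the plain date schema and the union of null and date. Skipping the unwrapNullable step would return the raw int 17296 for the nullable case, which is the behaviour reported here.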
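The int is then treated as milliseconds since the Unix epoch by record writers such as JsonRecordSetWriter, so the day count is interpreted as a few seconds after the epoch and the date is written as 1970-01-01.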
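The unit mix-up can be shown with plain java.time calls. This is only a sketch: the class and method names (EpochUnitMixup, asEpochDays, asEpochMillis) are hypothetical, and UTC is assumed when interpreting the value as milliseconds. The value 17296 is the day count for 2017-05-10, the date used in the flow below.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

final class EpochUnitMixup {
    // Correct reading: the int is a number of days since 1970-01-01.
    static LocalDate asEpochDays(int raw) {
        return LocalDate.ofEpochDay(raw);
    }

    // Incorrect reading: the same int taken as milliseconds since the epoch (UTC assumed).
    static LocalDate asEpochMillis(int raw) {
        return Instant.ofEpochMilli(raw).atZone(ZoneOffset.UTC).toLocalDate();
    }

    public static void main(String[] args) {
        int raw = 17296; // days between 1970-01-01 and 2017-05-10
        System.out.println(asEpochDays(raw));   // 2017-05-10
        System.out.println(asEpochMillis(raw)); // 1970-01-01
    }
}
```

Read as milliseconds, 17296 is about 17 seconds after the epoch, which matches the 1970-01-01 seen in the nullable result.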
Here is a reproducible flow:
input.json
{"date": "2017-05-10", "time": "19:55:34"}
input.json
-> ConvertRecord (JsonTreeReader/AvroRecordSetWriter)
-> ConvertRecord (AvroRecordReader/JsonRecordSetWriter): this Avro reader fails to convert a nullable logical date.
result-not_null
[{"date":"2017-05-10","time":"19:55:34"}]
result-nullable
[{"date":"1970-01-01","time":"19:55:34"}]