Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
I've noticed that the current AvroSerDe will happily accept a schema that uses strings instead of integers for scale and precision, e.g. the fragment "precision":"4","scale":"1" from the following table:
CREATE TABLE `avro_dec1`(
  `name` string COMMENT 'from deserializer',
  `value` decimal(4,1) COMMENT 'from deserializer')
COMMENT 'just drop the schema right into the HQL'
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
  'numFiles'='1',
  'avro.schema.literal'='{\"namespace\":\"com.howdy\",\"name\":\"some_schema\",\"type\":\"record\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"value\",\"type\":{\"type\":\"bytes\",\"logicalType\":\"decimal\",\"precision\":\"4\",\"scale\":\"1\"}}]}'
);
However, the decimal logical type spec defined in AVRO-1402 requires these values to be integers, so the only valid form of the fragment is "precision":4,"scale":1 (i.e. no double quotes around the numbers).
Because Hive can propagate this incorrect schema to new files, and thereby create files with an invalid schema, I think we should change the behavior and insist on the correct schema.
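To make the distinction concrete, here is a minimal sketch of the kind of check being asked for, written in Python rather than against Hive's actual code. The function name find_non_integer_decimal_attrs and the path notation are hypothetical and are not part of Hive or Avro. It only looks at whether precision and scale are JSON integers and does not validate anything else about the schema.

```python
import json


def find_non_integer_decimal_attrs(schema, path="$"):
    """Return (path, attribute, value) tuples for every decimal logical type
    whose precision or scale is present but is not a JSON integer.

    Hypothetical helper for illustration only; not Hive or Avro API.
    """
    problems = []
    if isinstance(schema, dict):
        if schema.get("logicalType") == "decimal":
            for attr in ("precision", "scale"):
                if attr in schema:
                    value = schema[attr]
                    # bool is a subclass of int in Python, so exclude it explicitly.
                    if isinstance(value, bool) or not isinstance(value, int):
                        problems.append((path, attr, value))
        # Recurse into nested types (record fields, arrays, unions, ...).
        for key, child in schema.items():
            problems.extend(find_non_integer_decimal_attrs(child, f"{path}.{key}"))
    elif isinstance(schema, list):
        for index, child in enumerate(schema):
            problems.extend(find_non_integer_decimal_attrs(child, f"{path}[{index}]"))
    return problems


# Schema literal from the table above (string-valued precision and scale).
bad_schema = json.loads(
    '{"namespace":"com.howdy","name":"some_schema","type":"record",'
    '"fields":[{"name":"name","type":"string"},'
    '{"name":"value","type":{"type":"bytes","logicalType":"decimal",'
    '"precision":"4","scale":"1"}}]}'
)

# Same schema with integer-valued precision and scale.
good_schema = json.loads(
    '{"namespace":"com.howdy","name":"some_schema","type":"record",'
    '"fields":[{"name":"name","type":"string"},'
    '{"name":"value","type":{"type":"bytes","logicalType":"decimal",'
    '"precision":4,"scale":1}}]}'
)

bad_result = find_non_integer_decimal_attrs(bad_schema)
good_result = find_non_integer_decimal_attrs(good_schema)
print(bad_result)
print(good_result)
```

With the quoted values the check reports both precision and scale on the "value" field. With the unquoted values it reports nothing.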
Attachments
Issue Links
- relates to HIVE-13251 hive can't read the decimal in AVRO file generated from previous version (Closed)