Details
Type: Bug
Status: Resolved
Priority: P2
Resolution: Fixed
Description
The AvroParquetWriter instantiated in ParquetIO [1] does not specify a data model.
The writer therefore falls back to its default, the SpecificData model [2], while the AvroParquetReader reads with a GenericData model [3].
ParquetIO should pass the correct data model to the writer so that the write and read paths agree.
[1] https://github.com/apache/beam/blob/v2.28.0/sdks/java/io/parquet/src/main/java/org/apache/beam/sdk/io/parquet/ParquetIO.java#L1052
[2] https://github.com/apache/parquet-mr/blob/master/parquet-avro/src/main/java/org/apache/parquet/avro/AvroParquetWriter.java#L163
[3] https://github.com/apache/beam/blob/v2.28.0/sdks/java/io/parquet/src/main/java/org/apache/beam/sdk/io/parquet/ParquetIO.java#L704
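
To illustrate the mismatch described above, here is a minimal sketch of a writer and a reader that both set the data model explicitly. It assumes parquet-avro, Avro, and Hadoop are on the classpath. The class name DataModelRoundTrip, the one-field schema, and the temporary file are hypothetical and are not taken from ParquetIO. It shows one way to keep both sides on GenericData and is not necessarily the change that was applied in Beam.

```java
import java.io.IOException;
import java.nio.file.Files;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.generic.GenericRecordBuilder;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetReader;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetReader;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.util.HadoopInputFile;
import org.apache.parquet.hadoop.util.HadoopOutputFile;

// Hypothetical example class; not part of Beam or parquet-mr.
public class DataModelRoundTrip {

  // Hypothetical schema with a single required string field.
  static final Schema SCHEMA =
      SchemaBuilder.record("Example").fields().requiredString("name").endRecord();

  // Writes one record and reads it back, using GenericData on both sides.
  static String roundTrip(String name) throws IOException {
    Configuration conf = new Configuration();
    java.nio.file.Path dir = Files.createTempDirectory("parquet-data-model");
    Path file = new Path(dir.resolve("example.parquet").toUri());

    // Without withDataModel(...), the writer builder keeps its default model.
    try (ParquetWriter<GenericRecord> writer =
        AvroParquetWriter.<GenericRecord>builder(HadoopOutputFile.fromPath(file, conf))
            .withSchema(SCHEMA)
            .withDataModel(GenericData.get())
            .build()) {
      writer.write(new GenericRecordBuilder(SCHEMA).set("name", name).build());
    }

    // The reader is given the same model, so both paths agree.
    try (ParquetReader<GenericRecord> reader =
        AvroParquetReader.<GenericRecord>builder(HadoopInputFile.fromPath(file, conf))
            .withDataModel(GenericData.get())
            .build()) {
      return reader.read().get("name").toString();
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(roundTrip("beam"));
  }
}
```

Setting the model on both builders makes the choice visible at the call site instead of relying on each builder's default.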