Description
Observed on latest 0.14.x commit, c241170d9bc2cd8415b04e06ecea40ed3d80f64d.
When debug logging is enabled, tests that instantiate a ParquetFileReader fail with:
java.lang.RuntimeException: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.apache.parquet.schema.LogicalTypeAnnotation$StringLogicalTypeAnnotation and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: org.apache.parquet.hadoop.metadata.ParquetMetadata["fileMetaData"]->org.apache.parquet.hadoop.metadata.FileMetaData["schema"]->org.apache.parquet.schema.MessageType["fields"]->java.util.ArrayList[24]->org.apache.parquet.schema.PrimitiveType["logicalTypeAnnotation"])
    at org.apache.parquet.hadoop.metadata.ParquetMetadata.toJSON(ParquetMetadata.java:68)
    at org.apache.parquet.hadoop.metadata.ParquetMetadata.toPrettyJSON(ParquetMetadata.java:48)
    at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:1592)
    at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:629)
    at org.apache.parquet.hadoop.ParquetFileReader.<init>(ParquetFileReader.java:902)
    at org.apache.parquet.hadoop.ParquetFileReader.open(ParquetFileReader.java:698)
    at org.apache.parquet.hadoop.ColumnIndexValidator.checkContractViolations(ColumnIndexValidator.java:556)
    at org.apache.parquet.statistics.TestColumnIndexes.testColumnIndexes(TestColumnIndexes.java:348)
Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.apache.parquet.schema.LogicalTypeAnnotation$StringLogicalTypeAnnotation and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: org.apache.parquet.hadoop.metadata.ParquetMetadata["fileMetaData"]->org.apache.parquet.hadoop.metadata.FileMetaData["schema"]->org.apache.parquet.schema.MessageType["fields"]->java.util.ArrayList[24]->org.apache.parquet.schema.PrimitiveType["logicalTypeAnnotation"])
    at com.fasterxml.jackson.databind.exc.InvalidDefinitionException.from(InvalidDefinitionException.java:77)
    at com.fasterxml.jackson.databind.SerializerProvider.reportBadDefinition(SerializerProvider.java:1330)
    at com.fasterxml.jackson.databind.DatabindContext.reportBadDefinition(DatabindContext.java:414)
    at com.fasterxml.jackson.databind.ser.impl.UnknownSerializer.failForEmpty(UnknownSerializer.java:53)
    at com.fasterxml.jackson.databind.ser.impl.UnknownSerializer.serialize(UnknownSerializer.java:30)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:732)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:770)
    at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:183)
(Note: this seems to happen when the schema doesn't contain a logical type, which makes me suspect that some Jackson configuration for handling null values may be needed.)
I also see a few exceptions related to encryption:
2024-05-07 14:37:12 ERROR TestPropertiesDrivenEncryption - ENCRYPT_COLUMNS_AND_FOOTER_CTR - DECRYPT_WITH_KEY_RETRIEVER Error: Didn't expect an exception, but got [com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.apache.parquet.crypto.keytools.FileKeyUnwrapper and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: org.apache.parquet.hadoop.metadata.FileMetaData["fileDecryptor"]->org.apache.parquet.crypto.InternalFileDecryptor["decryptionProperties"]->org.apache.parquet.crypto.FileDecryptionProperties["keyRetriever"])]
java.lang.RuntimeException: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.apache.parquet.crypto.keytools.FileKeyUnwrapper and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: org.apache.parquet.hadoop.metadata.FileMetaData["fileDecryptor"]-
To repro, enable debug logging or just comment out `if (LOG.isDebugEnabled())` in ParquetMetadataConverter, as I did here: https://github.com/apache/parquet-mr/compare/master...clairemcginty:parquet-mr:repro-avro-metadata-print-bug?expand=1
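For anyone wondering why this only shows up with debug logging on, here is a minimal, self-contained sketch of the guard pattern. It is not the parquet-mr code: `DebugGuardDemo`, `readFooter` and `failingToJson` are hypothetical names, `java.util.logging` stands in for the real logger, and the serializer is simulated with a method that always throws.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

public class DebugGuardDemo {

    // Hypothetical stand-in for the metadata-to-JSON call: it always fails,
    // simulating the Jackson serialization error reported above.
    static String failingToJson() {
        throw new RuntimeException("No serializer found (simulated)");
    }

    // The serializer is only invoked inside the debug guard, so the failure
    // stays hidden until debug (FINE) logging is enabled.
    static String readFooter(Logger log, Supplier<String> toJson) {
        if (log.isLoggable(Level.FINE)) {
            log.fine(toJson.get());
        }
        return "footer";
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("demo");

        // Debug disabled: the guard skips the serializer and the read succeeds.
        log.setLevel(Level.INFO);
        System.out.println(readFooter(log, DebugGuardDemo::failingToJson));

        // Debug enabled: the serializer runs and its exception aborts the read.
        log.setLevel(Level.FINE);
        try {
            readFooter(log, DebugGuardDemo::failingToJson);
        } catch (RuntimeException e) {
            System.out.println("debug enabled: " + e.getMessage());
        }
    }
}
```

With the level at INFO the read returns normally. With the level at FINE the same call throws, which matches the behaviour seen when the guard is commented out.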