PARQUET-2468

ParquetMetadata.toPrettyJSON throws exception on file read when LOG.isDebugEnabled()

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.14.0
    • Fix Version/s: 1.15.0, 1.14.1
    • Component/s: None
    • Labels: None

    Description

      Observed on the latest 1.14.x commit, c241170d9bc2cd8415b04e06ecea40ed3d80f64d.

      When debug logging is enabled, tests that instantiate a ParquetFileReader fail with:

       

       

      java.lang.RuntimeException: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.apache.parquet.schema.LogicalTypeAnnotation$StringLogicalTypeAnnotation and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: org.apache.parquet.hadoop.metadata.ParquetMetadata["fileMetaData"]->org.apache.parquet.hadoop.metadata.FileMetaData["schema"]->org.apache.parquet.schema.MessageType["fields"]->java.util.ArrayList[24]->org.apache.parquet.schema.PrimitiveType["logicalTypeAnnotation"])
      at org.apache.parquet.hadoop.metadata.ParquetMetadata.toJSON(ParquetMetadata.java:68)
      at org.apache.parquet.hadoop.metadata.ParquetMetadata.toPrettyJSON(ParquetMetadata.java:48)
      at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:1592)
      at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:629)
      at org.apache.parquet.hadoop.ParquetFileReader.<init>(ParquetFileReader.java:902)
      at org.apache.parquet.hadoop.ParquetFileReader.open(ParquetFileReader.java:698)
      at org.apache.parquet.hadoop.ColumnIndexValidator.checkContractViolations(ColumnIndexValidator.java:556)
      at org.apache.parquet.statistics.TestColumnIndexes.testColumnIndexes(TestColumnIndexes.java:348)
      Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.apache.parquet.schema.LogicalTypeAnnotation$StringLogicalTypeAnnotation and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: org.apache.parquet.hadoop.metadata.ParquetMetadata["fileMetaData"]->org.apache.parquet.hadoop.metadata.FileMetaData["schema"]->org.apache.parquet.schema.MessageType["fields"]->java.util.ArrayList[24]->org.apache.parquet.schema.PrimitiveType["logicalTypeAnnotation"])
      at com.fasterxml.jackson.databind.exc.InvalidDefinitionException.from(InvalidDefinitionException.java:77)
      at com.fasterxml.jackson.databind.SerializerProvider.reportBadDefinition(SerializerProvider.java:1330)
      at com.fasterxml.jackson.databind.DatabindContext.reportBadDefinition(DatabindContext.java:414)
      at com.fasterxml.jackson.databind.ser.impl.UnknownSerializer.failForEmpty(UnknownSerializer.java:53)
      at com.fasterxml.jackson.databind.ser.impl.UnknownSerializer.serialize(UnknownSerializer.java:30)
      at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:732)
      at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:770)
      at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:183)
      

      Note: this seems to happen when the schema doesn't contain a logical type, which makes me suspect that some Jackson configuration for handling null values is needed.

       

      I also see a few exceptions related to encryption:

      2024-05-07 14:37:12 ERROR TestPropertiesDrivenEncryption - ENCRYPT_COLUMNS_AND_FOOTER_CTR - DECRYPT_WITH_KEY_RETRIEVER Error: Didn't expect an exception, but got [com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.apache.parquet.crypto.keytools.FileKeyUnwrapper and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: org.apache.parquet.hadoop.metadata.FileMetaData["fileDecryptor"]->org.apache.parquet.crypto.InternalFileDecryptor["decryptionProperties"]->org.apache.parquet.crypto.FileDecryptionProperties["keyRetriever"])]
      java.lang.RuntimeException: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.apache.parquet.crypto.keytools.FileKeyUnwrapper and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: org.apache.parquet.hadoop.metadata.FileMetaData["fileDecryptor"]- 

      To reproduce, enable debug logging, or simply comment out the `if (LOG.isDebugEnabled())` guard in ParquetMetadataConverter, as I did here: https://github.com/apache/parquet-mr/compare/master...clairemcginty:parquet-mr:repro-avro-metadata-print-bug?expand=1
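
The failure mode here, where code behind a debug guard only runs (and so only fails) once debug logging is turned on, can be sketched in a few lines. This is an illustration only, not Parquet code: it uses the JDK's `java.util.logging` rather than Parquet's own logger, and `Footer`, `toPrettyJson`, `readFooter`, and `failsAt` are hypothetical stand-ins for the real metadata class and read path.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class DebugGuardSketch {
    private static final Logger LOG = Logger.getLogger(DebugGuardSketch.class.getName());

    /** Hypothetical stand-in for a metadata object whose pretty-printer is broken. */
    static final class Footer {
        String toPrettyJson() {
            // Simulates a serializer failure (assumption: any unchecked exception will do).
            throw new RuntimeException("No serializer found");
        }
    }

    /** Hypothetical read path: the pretty-printer is only invoked when FINE (debug) is enabled. */
    static Footer readFooter() {
        Footer footer = new Footer();
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(footer.toPrettyJson());
        }
        return footer;
    }

    /** Returns true if readFooter() throws when the logger is set to the given level. */
    static boolean failsAt(Level level) {
        LOG.setLevel(level);
        try {
            readFooter();
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("INFO fails: " + failsAt(Level.INFO));
        System.out.println("FINE fails: " + failsAt(Level.FINE));
    }
}
```

At INFO the guarded branch is skipped and the read succeeds; at FINE the same read throws. That is why the bug stays hidden in a default test run and only appears with debug logging enabled or with the guard commented out.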

          People

            Assignee: Michel Davit (RustedBones)
            Reporter: Claire McGinty (clairemcginty)