Parquet / PARQUET-311

NPE when debug logging file metadata


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.8.0
    • Fix Version/s: None
    • Component/s: parquet-mr
    • Labels: None

      Description

      When debug logging is enabled and every value in a block is null, Parquet throws an NPE while pretty-printing the file metadata, because the column statistics for that block have no min/max defined.
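
      For context, a minimal read that can hit this path looks roughly like the sketch below (the file name and the use of the example GroupReadSupport are illustrative assumptions, not taken from the failing test). Building the reader reads the footer, and with DEBUG logging enabled on org.apache.parquet the footer is pretty-printed as JSON, which is where the NPE surfaces; the resulting failure is the trace that follows.

      import org.apache.hadoop.fs.Path;
      import org.apache.parquet.example.data.Group;
      import org.apache.parquet.hadoop.ParquetReader;
      import org.apache.parquet.hadoop.example.GroupReadSupport;

      public class ReadAllNullsFile {
        public static void main(String[] args) throws Exception {
          // "all-nulls.parquet" is a placeholder for a file whose binary column
          // contains only nulls, so its block statistics carry no min/max.
          ParquetReader<Group> reader = ParquetReader
              .builder(new GroupReadSupport(), new Path("all-nulls.parquet"))
              .build(); // the footer is read (and, with debug logging, pretty-printed) here
          try {
            reader.read();
          } finally {
            reader.close();
          }
        }
      }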

      java.io.IOException: Could not read footer: java.lang.RuntimeException: org.codehaus.jackson.map.JsonMappingException: (was java.lang.NullPointerException) (through reference chain: org.apache.parquet.hadoop.metadata.ParquetMetadata["blocks"]->java.util.ArrayList[0]->org.apache.parquet.hadoop.metadata.BlockMetaData["columns"]->java.util.UnmodifiableRandomAccessList[8]->org.apache.parquet.hadoop.metadata.IntColumnChunkMetaData["statistics"]->org.apache.parquet.column.statistics.BinaryStatistics["maxBytes"])
      	at org.apache.parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:247)
      	at org.apache.parquet.hadoop.ParquetFileReader.readAllFootersInParallelUsingSummaryFiles(ParquetFileReader.java:188)
      	at org.apache.parquet.hadoop.ParquetReader.<init>(ParquetReader.java:124)
      	at org.apache.parquet.hadoop.ParquetReader.<init>(ParquetReader.java:55)
      	at org.apache.parquet.hadoop.ParquetReader$Builder.build(ParquetReader.java:264)
      	at org.apache.parquet.hadoop.TestParquetVectorReader.testNullReads(TestParquetVectorReader.java:355)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
      	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
      	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
      	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
      	at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
      	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:68)
      Caused by: java.lang.RuntimeException: org.codehaus.jackson.map.JsonMappingException: (was java.lang.NullPointerException) (through reference chain: org.apache.parquet.hadoop.metadata.ParquetMetadata["blocks"]->java.util.ArrayList[0]->org.apache.parquet.hadoop.metadata.BlockMetaData["columns"]->java.util.UnmodifiableRandomAccessList[8]->org.apache.parquet.hadoop.metadata.IntColumnChunkMetaData["statistics"]->org.apache.parquet.column.statistics.BinaryStatistics["maxBytes"])
      	at org.apache.parquet.hadoop.metadata.ParquetMetadata.toJSON(ParquetMetadata.java:72)
      	at org.apache.parquet.hadoop.metadata.ParquetMetadata.toPrettyJSON(ParquetMetadata.java:62)
      	at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:528)
      	at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:430)
      	at org.apache.parquet.hadoop.ParquetFileReader$2.call(ParquetFileReader.java:237)
      	at org.apache.parquet.hadoop.ParquetFileReader$2.call(ParquetFileReader.java:233)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: org.codehaus.jackson.map.JsonMappingException: (was java.lang.NullPointerException) (through reference chain: org.apache.parquet.hadoop.metadata.ParquetMetadata["blocks"]->java.util.ArrayList[0]->org.apache.parquet.hadoop.metadata.BlockMetaData["columns"]->java.util.UnmodifiableRandomAccessList[8]->org.apache.parquet.hadoop.metadata.IntColumnChunkMetaData["statistics"]->org.apache.parquet.column.statistics.BinaryStatistics["maxBytes"])
      	at org.codehaus.jackson.map.JsonMappingException.wrapWithPath(JsonMappingException.java:218)
      	at org.codehaus.jackson.map.JsonMappingException.wrapWithPath(JsonMappingException.java:183)
      	at org.codehaus.jackson.map.ser.std.SerializerBase.wrapAndThrow(SerializerBase.java:140)
      	at org.codehaus.jackson.map.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:158)
      	at org.codehaus.jackson.map.ser.BeanSerializer.serialize(BeanSerializer.java:112)
      	at org.codehaus.jackson.map.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:446)
      	at org.codehaus.jackson.map.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:150)
      	at org.codehaus.jackson.map.ser.BeanSerializer.serialize(BeanSerializer.java:112)
      	at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:122)
      	at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:71)
      	at org.codehaus.jackson.map.ser.std.AsArraySerializerBase.serialize(AsArraySerializerBase.java:86)
      	at org.codehaus.jackson.map.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:446)
      	at org.codehaus.jackson.map.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:150)
      	at org.codehaus.jackson.map.ser.BeanSerializer.serialize(BeanSerializer.java:112)
      	at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:122)
      	at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:71)
      	at org.codehaus.jackson.map.ser.std.AsArraySerializerBase.serialize(AsArraySerializerBase.java:86)
      	at org.codehaus.jackson.map.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:446)
      	at org.codehaus.jackson.map.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:150)
      	at org.codehaus.jackson.map.ser.BeanSerializer.serialize(BeanSerializer.java:112)
      	at org.codehaus.jackson.map.ser.StdSerializerProvider._serializeValue(StdSerializerProvider.java:610)
      	at org.codehaus.jackson.map.ser.StdSerializerProvider.serializeValue(StdSerializerProvider.java:256)
      	at org.codehaus.jackson.map.ObjectMapper._configAndWriteValue(ObjectMapper.java:2575)
      	at org.codehaus.jackson.map.ObjectMapper.writeValue(ObjectMapper.java:2081)
      	at org.apache.parquet.hadoop.metadata.ParquetMetadata.toJSON(ParquetMetadata.java:68)
      	... 9 more
      Caused by: java.lang.NullPointerException
      	at org.apache.parquet.column.statistics.BinaryStatistics.getMaxBytes(BinaryStatistics.java:56)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at org.codehaus.jackson.map.ser.BeanPropertyWriter.get(BeanPropertyWriter.java:483)
      	at org.codehaus.jackson.map.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:418)
      	at org.codehaus.jackson.map.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:150)
      	... 30 more
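
      The immediate cause is that BinaryStatistics.getMaxBytes() dereferences the max Binary field, which is never set when every value in the block is null. Below is a minimal sketch of a null-safe accessor; the field names match parquet-column, but treat it as an illustration of the kind of guard needed rather than the exact patch that resolved this issue.

      public byte[] getMaxBytes() {
        // Guard against statistics that never observed a non-null value.
        return max == null ? null : max.getBytes();
      }

      public byte[] getMinBytes() {
        return min == null ? null : min.getBytes();
      }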
      


              People

              • Assignee: Nezih Yigitbasi (nezihyigitbasi)
              • Reporter: Nezih Yigitbasi (nezihyigitbasi)
              • Votes: 0
              • Watchers: 2
