Description
parquet-mr is unable to handle the following Avro schema:
{
  "type": "record",
  "namespace": "com.cloudera.impala",
  "name": "table_3",
  "fields": [
    {
      "name": "field_6",
      "type": {
        "type": "array",
        "items": ["null", {"type": "map", "values": ["null", "string"]}]
      }
    }
  ]
}
If a map element in the array is null, the following exception is thrown:
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at parquet.avro.AvroWriteSupport.writeMap(AvroWriteSupport.java:185)
    at parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:277)
    at parquet.avro.AvroWriteSupport.access$400(AvroWriteSupport.java:48)
    at parquet.avro.AvroWriteSupport$TwoLevelListWriter.writeCollection(AvroWriteSupport.java:473)
    at parquet.avro.AvroWriteSupport$ListWriter.writeList(AvroWriteSupport.java:322)
    at parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:275)
    at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:169)
    at parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:144)
    at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:116)
    at parquet.hadoop.ParquetWriter.write(ParquetWriter.java:324)
    at com.cloudera.impala.datagenerator.RandomNestedDataGenerator.writeFile(RandomNestedDataGenerator.java:69)
    at com.cloudera.impala.datagenerator.RandomNestedDataGenerator.main(RandomNestedDataGenerator.java:284)
The likely cause is that TwoLevelListWriter does not check whether an element is null before writing it, so a null value in the array is passed on to writeMap and dereferenced there: https://github.com/apache/parquet-mr/blob/master/parquet-avro/src/main/java/org/apache/parquet/avro/AvroWriteSupport.java#L456
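The suspected failure mode can be illustrated without any Parquet or Avro dependency. The sketch below is only a simplified model: the class NullSafeListWriterSketch, its methods, and the event strings are all hypothetical and are not part of parquet-avro. It contrasts a loop that passes every element straight to a map writer with one that checks for null first. How the real writer should encode (or reject) a null element is a separate question that this sketch does not settle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the list-writing path; NOT the parquet-avro API.
class NullSafeListWriterSketch {

    // Unguarded variant: every element is handed to writeMap, including nulls.
    static List<String> writeUnguarded(List<Map<String, String>> elements) {
        List<String> events = new ArrayList<>();
        for (Map<String, String> element : elements) {
            events.add("startField");
            writeMap(element, events); // NullPointerException if element is null
            events.add("endField");
        }
        return events;
    }

    // Guarded variant: a null element is recorded as "null" and never dereferenced.
    // Recording a marker is an assumption made for illustration only.
    static List<String> writeGuarded(List<Map<String, String>> elements) {
        List<String> events = new ArrayList<>();
        for (Map<String, String> element : elements) {
            if (element == null) {
                events.add("null");
                continue;
            }
            events.add("startField");
            writeMap(element, events);
            events.add("endField");
        }
        return events;
    }

    // Emits one "key=value" event per entry; calling entrySet() on null throws.
    static void writeMap(Map<String, String> map, List<String> events) {
        for (Map.Entry<String, String> entry : map.entrySet()) {
            events.add(entry.getKey() + "=" + entry.getValue());
        }
    }

    public static void main(String[] args) {
        // An array with one null element and one map, as the schema permits.
        List<Map<String, String>> input =
                Arrays.asList(null, Collections.singletonMap("k", "v"));

        System.out.println("guarded: " + writeGuarded(input));
        try {
            writeUnguarded(input);
        } catch (NullPointerException e) {
            System.out.println("unguarded: NullPointerException");
        }
    }
}
```

With the guard in place the null element never reaches writeMap, which is the call that fails in the stack trace above.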
Issue Links
- blocks PARQUET-392 Release Parquet-mr 1.9.0 (Resolved)
- links to