Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version: 1.12.0
None
Description
We have a custom Processor that uses ParquetReader to parse an incoming FlowFile and then sends the records to the target system. It worked before, but broke with the NiFi 1.12.0 release.
The exception stack trace:
Caused by: java.lang.NullPointerException: Name is null
at java.lang.Enum.valueOf(Enum.java:236)
at org.apache.parquet.hadoop.metadata.CompressionCodecName.valueOf(CompressionCodecName.java:26)
at org.apache.nifi.parquet.utils.ParquetUtils.createParquetConfig(ParquetUtils.java:172)
at org.apache.nifi.parquet.ParquetReader.createRecordReader(ParquetReader.java:48)
at org.apache.nifi.serialization.RecordReaderFactory.createRecordReader(RecordReaderFactory.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:254)
at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:105)
at com.sun.proxy.$Proxy100.createRecordReader(Unknown Source)
at io.pivotal.greenplum.nifi.processors.PutGreenplumRecord.lambda$null$1(PutGreenplumRecord.java:300)
at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)
Basically, the creation of a ParquetReader by the RecordReaderFactory fails.
The actual problem occurs here: https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-parquet-bundle/nifi-parquet-processors/src/main/java/org/apache/nifi/parquet/utils/ParquetUtils.java#L171-L172
where
final String compressionTypeValue = context.getProperty(ParquetUtils.COMPRESSION_TYPE).getValue();
returns null, because the "compression-type" property is not exposed on ParquetReader and therefore cannot be set by flow designers. The null value is then passed to the enum lookup, which fails:
final CompressionCodecName codecName = CompressionCodecName.valueOf(compressionTypeValue);
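The failure mode can be reproduced outside NiFi. The following is a minimal sketch, assuming a hypothetical `Codec` enum as a stand-in for `CompressionCodecName` and a hypothetical `CodecLookup` helper; neither is part of NiFi or Parquet. It shows that `valueOf` throws a NullPointerException for a null name, and what a null-safe lookup with an explicit fallback could look like.

```java
// Hypothetical stand-in for org.apache.parquet.hadoop.metadata.CompressionCodecName;
// only a few constants are listed for illustration.
enum Codec { UNCOMPRESSED, SNAPPY, GZIP }

// Hypothetical helper, not part of NiFi.
class CodecLookup {

    // Mirrors the failing call: the enum's valueOf throws
    // NullPointerException when the name is null.
    static Codec strict(String name) {
        return Codec.valueOf(name);
    }

    // Null-safe variant: use the fallback when the property was never set.
    static Codec lenient(String name, Codec fallback) {
        return name == null ? fallback : Codec.valueOf(name);
    }

    public static void main(String[] args) {
        try {
            strict(null);
        } catch (NullPointerException e) {
            System.out.println("Lookup failed: " + e.getMessage());
        }
        System.out.println("Lenient lookup: " + lenient(null, Codec.UNCOMPRESSED));
    }
}
```

A guard like `lenient` would only mask the symptom in the Parquet code path; it does not address why the property value is null in the first place.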
While there might be several solutions to this, including updating the Parquet-specific defaulting logic, I traced the root cause of this regression to the fix for NIFI-7635, specifically this commit: https://github.com/apache/nifi/commit/4f11e3626093d3090f97c0efc5e229d83b6006e4#diff-782335ecee68f6939c3724dba3983d3d
where the default value of the provided property descriptor, previously obtained via
property.getDefaultValue()
is no longer used. For the use case in question, that default was UNCOMPRESSED, which is why the reader worked before this commit.
I think the issue should be fixed at this point in the code, since it might affect a variety of other use cases.
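To illustrate the difference the default-value fallback makes, here is a minimal sketch with hypothetical `Descriptor` and `LookupContext` classes. These are simplified models for illustration only and are not NiFi's actual `PropertyDescriptor` or context implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a property descriptor; not NiFi's class.
final class Descriptor {
    final String name;
    final String defaultValue;

    Descriptor(String name, String defaultValue) {
        this.name = name;
        this.defaultValue = defaultValue;
    }
}

// Hypothetical context holding only the values that were set explicitly.
final class LookupContext {
    private final Map<String, String> configured = new HashMap<>();

    void set(String name, String value) {
        configured.put(name, value);
    }

    // No fallback: a property that was never set yields null.
    String rawValue(Descriptor descriptor) {
        return configured.get(descriptor.name);
    }

    // With fallback: an unset property yields the descriptor's own default.
    String valueOrDefault(Descriptor descriptor) {
        String value = configured.get(descriptor.name);
        return value != null ? value : descriptor.defaultValue;
    }
}
```

With a descriptor whose default is `UNCOMPRESSED` and nothing configured, `rawValue` returns null while `valueOrDefault` returns `UNCOMPRESSED`, which matches the before-and-after behavior described above.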