  Apache NiFi / NIFI-7817

ParquetReader fails to instantiate due to missing default value for compression-type property

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.12.0
    • Fix Version/s: 1.13.0
    • Component/s: Core Framework
    • Labels: None

    Description

      We have a custom Processor that uses ParquetReader to parse incoming FlowFiles and then sends the records to the target system. It worked before, but broke with the NiFi 1.12.0 release.

      The exception stack trace:

      Caused by: java.lang.NullPointerException: Name is null
       at java.lang.Enum.valueOf(Enum.java:236)
       at org.apache.parquet.hadoop.metadata.CompressionCodecName.valueOf(CompressionCodecName.java:26)
       at org.apache.nifi.parquet.utils.ParquetUtils.createParquetConfig(ParquetUtils.java:172)
       at org.apache.nifi.parquet.ParquetReader.createRecordReader(ParquetReader.java:48)
       at org.apache.nifi.serialization.RecordReaderFactory.createRecordReader(RecordReaderFactory.java:49)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:254)
       at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:105)
       at com.sun.proxy.$Proxy100.createRecordReader(Unknown Source)
       at io.pivotal.greenplum.nifi.processors.PutGreenplumRecord.lambda$null$1(PutGreenplumRecord.java:300)
       at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)

       

      In short, ParquetReader (the RecordReaderFactory in use) fails to create a record reader.

      The actual problem occurs here: https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-parquet-bundle/nifi-parquet-processors/src/main/java/org/apache/nifi/parquet/utils/ParquetUtils.java#L171-L172

      where 

      final String compressionTypeValue = context.getProperty(ParquetUtils.COMPRESSION_TYPE).getValue();

      returns null, since the "compression-type" property is not exposed on ParquetReader and therefore cannot be set by flow designers. The null value is then passed to the enum lookup, which fails:

      final CompressionCodecName codecName = CompressionCodecName.valueOf(compressionTypeValue);
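To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use the NiFi or Parquet classes: the `Codec` enum is a hypothetical stand-in for `CompressionCodecName`, and `resolveWithDefault` is only one possible parquet-side defaulting approach, assuming UNCOMPRESSED is the intended default.

```java
final class CompressionDefaultSketch {
    // Hypothetical stand-in for org.apache.parquet.hadoop.metadata.CompressionCodecName.
    enum Codec { UNCOMPRESSED, SNAPPY, GZIP }

    // Mirrors the failing call: Enum.valueOf throws NullPointerException for a null name.
    static Codec resolveUnsafe(String value) {
        return Codec.valueOf(value);
    }

    // Null-safe variant (an assumption, not the actual fix): use UNCOMPRESSED when unset.
    static Codec resolveWithDefault(String value) {
        return value == null ? Codec.UNCOMPRESSED : Codec.valueOf(value);
    }

    public static void main(String[] args) {
        try {
            resolveUnsafe(null);
        } catch (NullPointerException e) {
            System.out.println("Lookup failed as in the stack trace: " + e);
        }
        System.out.println(resolveWithDefault(null));
        System.out.println(resolveWithDefault("SNAPPY"));
    }
}
```

A guard like this would only mask the symptom inside the parquet code; any other component relying on a descriptor's default value would still see null.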

      While there may be several ways to solve this, including updating the parquet-specific defaulting logic, I traced the root cause of the regression to the fix for NIFI-7635, specifically this commit: https://github.com/apache/nifi/commit/4f11e3626093d3090f97c0efc5e229d83b6006e4#diff-782335ecee68f6939c3724dba3983d3d

      where the default value of the provided property descriptor, previously obtained via

      property.getDefaultValue()

      is no longer used. For the use case in question that default was UNCOMPRESSED, and the reader worked before this commit.
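The difference between the two lookup behaviours can be sketched as follows. This is not the NiFi framework code: `Descriptor` is a hypothetical stand-in for a property descriptor, the configured properties are modelled as a plain map, and both method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class DefaultValueFallbackSketch {
    // Hypothetical stand-in for a property descriptor: a name plus a default value.
    static final class Descriptor {
        private final String name;
        private final String defaultValue;

        Descriptor(String name, String defaultValue) {
            this.name = name;
            this.defaultValue = defaultValue;
        }

        String getName() { return name; }
        String getDefaultValue() { return defaultValue; }
    }

    // Lookup that ignores the descriptor's default: an unset property yields null.
    static String lookupWithoutDefault(Map<String, String> configured, Descriptor property) {
        return configured.get(property.getName());
    }

    // Lookup that falls back to property.getDefaultValue() when nothing is configured.
    static String lookupWithDefault(Map<String, String> configured, Descriptor property) {
        String value = configured.get(property.getName());
        return value != null ? value : property.getDefaultValue();
    }

    public static void main(String[] args) {
        Descriptor compressionType = new Descriptor("compression-type", "UNCOMPRESSED");
        Map<String, String> configured = new HashMap<>(); // flow designer set nothing

        System.out.println(lookupWithoutDefault(configured, compressionType));
        System.out.println(lookupWithDefault(configured, compressionType));
    }
}
```

With the fallback in place, a property that is never exposed to flow designers still resolves to its declared default, which is the behaviour the reader depended on.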

      I think the issue should be fixed in that code, since it may affect a variety of other use cases.

          People

            Assignee: Pierre Villard (pvillard)
            Reporter: Alexander Denissov (adenissov)
            Votes: 1
            Watchers: 3

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 3h 50m