Description
I had a three-node cluster with Knox.
From time to time, an error occurred in the NiFi logs on this cluster:
2023-11-15 13:25:51,637 INFO org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator: Received a status of 500 from xy:8443 for request PUT /nifi-api/process-groups/d2cedf64-018b-1000-0000-0000164a79fc when performing first stage of two-stage commit. The action will not occur. Node explanation: An unexpected error has occurred. Please check the logs for additional details.
Sometimes I also got the "An unexpected error has occurred. Please check the logs for additional details." error in the UI. After some investigation, I found the error in the logs:
23-11-15 13:40:25,289 ERROR [NiFi Web Server-78] o.a.nifi.web.api.config.ThrowableMapper An unexpected error has occurred: java.util.zip.ZipException: Not in GZIP format. Returning Internal Server Error response.
java.util.zip.ZipException: Not in GZIP format
    at java.base/java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:176)
    at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:79)
    at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:91)
    at org.glassfish.jersey.message.GZipEncoder.decode(GZipEncoder.java:49)
    at org.glassfish.jersey.spi.ContentEncoder.aroundReadFrom(ContentEncoder.java:100)
    at org.glassfish.jersey.message.internal.ReaderInterceptorExecutor.proceed(ReaderInterceptorExecutor.java:132)
    at org.glassfish.jersey.server.internal.MappableExceptionWrapperInterceptor.aroundReadFrom(MappableExceptionWrapperInterceptor.java:49)
    at org.glassfish.jersey.message.internal.ReaderInterceptorExecutor.proceed(ReaderInterceptorExecutor.java:132)
    at org.glassfish.jersey.message.internal.MessageBodyFactory.readFrom(MessageBodyFactory.java:1072)
    at org.glassfish.jersey.message.internal.InboundMessageContext.readEntity(InboundMessageContext.java:919)
    at org.glassfish.jersey.server.ContainerRequest.readEntity(ContainerRequest.java:290)
    at org.glassfish.jersey.server.internal.inject.EntityParamValueParamProvider$EntityValueSupplier.apply(EntityParamValueParamProvider.java:73)
After many hours of debugging, I found that sometimes, when I use the cluster via Knox, the incoming "Accept-Encoding" header arrives with all letters in lowercase for some unknown reason (which is valid, because HTTP header names are case-insensitive - https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers). However, OkHttpReplicationClient assumes that the header name is always "Accept-Encoding" (https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-cluster/src/main/java/org/apache/nifi/cluster/coordination/http/replication/okhttp/OkHttpReplicationClient.java#L294 and https://github.com/apache/nifi/blob/main/nifi-commons/nifi-site-to-site-client/src/main/java/org/apache/nifi/remote/protocol/http/HttpHeaders.java#L25). Because of that, the client does not use gzip compression during replication, but when the other node receives the request, Jetty reads the original "accept-encoding" header and tries to decompress the input stream, which leads to the above error.
We need to add a few lines of code to the client so that it reads the header name case-insensitively.
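A minimal sketch of what a case-insensitive lookup could look like is shown below. It is not the actual NiFi code: the class HeaderLookup and the methods caseInsensitive and isGzipAccepted are hypothetical names used only for illustration, and the headers are assumed to be available as a plain Map<String, String>. The sketch also ignores quality values such as "gzip;q=0", which a real fix may want to handle.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of NiFi: illustrates a case-insensitive header lookup.
final class HeaderLookup {
    static final String ACCEPT_ENCODING = "Accept-Encoding";

    // Copy the headers into a map whose keys compare case-insensitively,
    // so "accept-encoding" and "Accept-Encoding" resolve to the same entry.
    static Map<String, String> caseInsensitive(Map<String, String> headers) {
        Map<String, String> copy = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        copy.putAll(headers);
        return copy;
    }

    // Returns true if the Accept-Encoding header (in any letter case) lists gzip.
    // Assumption: quality values (e.g. "gzip;q=0") are stripped and not evaluated.
    static boolean isGzipAccepted(Map<String, String> headers) {
        String value = caseInsensitive(headers).get(ACCEPT_ENCODING);
        if (value == null) {
            return false;
        }
        for (String token : value.split(",")) {
            String coding = token.trim().toLowerCase(Locale.ROOT);
            int semicolon = coding.indexOf(';');
            if (semicolon >= 0) {
                coding = coding.substring(0, semicolon).trim();
            }
            if (coding.equals("gzip")) {
                return true;
            }
        }
        return false;
    }
}
```

With this approach, a request whose headers contain the key "accept-encoding" with the value "gzip, deflate" is treated the same way as one that uses "Accept-Encoding", so the replicating node and the receiving node agree on whether the body is compressed.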