Description
When running my Flink job, I get the following error:
04.Apr. 20:43:12 WARN DefaultChannelPipeline - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.io.IOException: Network stream corrupted: invalid magicnumber in current envelope header.
    at org.apache.flink.runtime.io.network.netty.InboundEnvelopeDecoder.decodeEnvelope(InboundEnvelopeDecoder.java:239)
    at org.apache.flink.runtime.io.network.netty.InboundEnvelopeDecoder.decodeBuffer(InboundEnvelopeDecoder.java:127)
    at org.apache.flink.runtime.io.network.netty.InboundEnvelopeDecoder.channelRead(InboundEnvelopeDecoder.java:111)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:125)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
    at java.lang.Thread.run(Thread.java:745)
The failure is intermittent: sometimes the job works, and sometimes it fails with the above error.
When it fails, the job is still shown as running, but it makes no further progress until I cancel it manually. The logs then contain the error, often repeated hundreds of times.