FLINK-18129

Unhandled exception stack trace from DispatcherRestEndpoint when deploying Kubernetes session cluster


Details

    Description

      When deploying a session cluster on Kubernetes via bin/kubernetes-session.sh, I see the following stack trace in the master logs:

      2020-06-04 01:17:52,068 WARN  org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint   [] - Unhandled exception
      java.io.IOException: Connection reset by peer
      	at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_252]
      	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_252]
      	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_252]
      	at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_252]
      	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:377) ~[?:1.8.0_252]
      	at org.apache.flink.shaded.netty4.io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:247) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
      	at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1140) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
      	at org.apache.flink.shaded.netty4.io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:347) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
      	at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148) [flink-dist_2.11-1.11.0.jar:1.11.0]
      	at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:697) [flink-dist_2.11-1.11.0.jar:1.11.0]
      	at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:632) [flink-dist_2.11-1.11.0.jar:1.11.0]
      	at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:549) [flink-dist_2.11-1.11.0.jar:1.11.0]
      	at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:511) [flink-dist_2.11-1.11.0.jar:1.11.0]
      	at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918) [flink-dist_2.11-1.11.0.jar:1.11.0]
      	at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [flink-dist_2.11-1.11.0.jar:1.11.0]
      	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
      

      I am not sure whether this is caused by a configuration problem or by a K8s service that performs liveness checks against the REST port. As a consequence, the JM logs are cluttered with these stack traces.

      Most likely this is caused not by Flink but by some K8s behavior. The question is whether we can do something about it if it occurs often.
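One possible direction, sketched here only as an illustration: treat a peer-initiated connection reset as an expected event and log it at a lower level, so that it no longer appears as a WARN with a full stack trace. The class and method names below are hypothetical and not part of Flink, and matching on the exception message is an assumption that depends on the JVM and platform wording.

```java
import java.io.IOException;
import java.util.Locale;

/**
 * Hypothetical helper (not part of Flink): decides whether an exception looks like
 * a connection reset initiated by the remote peer, so that a caller could log it
 * at DEBUG instead of WARN.
 */
final class ConnectionResetClassifier {

    private ConnectionResetClassifier() {}

    /**
     * Returns true if the throwable, or any of its causes, is an IOException whose
     * message mentions a connection reset. The message check is an assumption: the
     * exact wording ("Connection reset by peer", "Connection reset") varies.
     */
    static boolean isConnectionReset(Throwable t) {
        // Walk the cause chain; the depth bound guards against cyclic causes.
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < 32; cur = cur.getCause(), depth++) {
            if (cur instanceof IOException && cur.getMessage() != null) {
                String msg = cur.getMessage().toLowerCase(Locale.ROOT);
                if (msg.contains("connection reset")) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Suggests a log level name for the given exception. */
    static String logLevelFor(Throwable t) {
        return isConnectionReset(t) ? "DEBUG" : "WARN";
    }
}
```

An exception handler on the REST endpoint could consult such a check before logging. Whether hiding these resets is desirable is a separate question, since it could also mask genuine client problems.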

          People

            Assignee: Unassigned
            Reporter: Till Rohrmann (trohrmann)