[FLINK-18129] Unhandled exception stack trace from DispatcherRestEndpoint when deploying Kubernetes session cluster - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Not a Priority
Resolution: Unresolved
Affects Version/s: 1.11.0
Fix Version/s: None
Component/s: Deployment / Kubernetes
Labels:
- auto-deprioritized-major
- auto-deprioritized-minor

Description

When deploying a session cluster on Kubernetes via bin/kubernetes-session.sh, I see the following stack trace in the master logs:

2020-06-04 01:17:52,068 WARN  org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint   [] - Unhandled exception
java.io.IOException: Connection reset by peer
	at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_252]
	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_252]
	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_252]
	at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_252]
	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:377) ~[?:1.8.0_252]
	at org.apache.flink.shaded.netty4.io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:247) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
	at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1140) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
	at org.apache.flink.shaded.netty4.io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:347) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
	at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148) [flink-dist_2.11-1.11.0.jar:1.11.0]
	at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:697) [flink-dist_2.11-1.11.0.jar:1.11.0]
	at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:632) [flink-dist_2.11-1.11.0.jar:1.11.0]
	at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:549) [flink-dist_2.11-1.11.0.jar:1.11.0]
	at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:511) [flink-dist_2.11-1.11.0.jar:1.11.0]
	at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918) [flink-dist_2.11-1.11.0.jar:1.11.0]
	at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [flink-dist_2.11-1.11.0.jar:1.11.0]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]

I am not sure whether this is a configuration problem or the result of a K8s service performing some kind of liveness check. The consequence is that the JM logs are cluttered with these stack traces.

Most likely this is caused not by Flink but by some K8s behavior. The question is whether we can do something about it if it occurs often.
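As a possible stopgap until the cause is understood, the log noise could be reduced by raising the log level for the logger that emits the warning. The fragment below is only a sketch: it assumes the JobManager loads a Log4j 2 properties file (the default logging setup in Flink 1.11), and the logger id `dispatcherrest` is a made-up label, not something Flink defines. The logger name is taken from the log line above.

```properties
# Assumption: added to the Log4j 2 properties file that the JobManager container actually loads.
# "dispatcherrest" is an arbitrary id chosen for this example.
logger.dispatcherrest.name = org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint
# Raising the level to ERROR hides the WARN-level "Unhandled exception" entries,
# but it also hides every other WARN from this class.
logger.dispatcherrest.level = ERROR
```

This only suppresses the symptom. It does not explain which client resets the connection, and it would also hide genuine warnings from the REST endpoint.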

People

Assignee: Unassigned

Reporter: Till Rohrmann

Votes: 0

Watchers: 6

Dates

Created: 04/Jun/20 12:49

Updated: 27/Nov/21 23:06