Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
2.1.0
-
None
-
None
Description
We upgraded from 0.10.2.2 to 2.1.0 (a cluster with 3 brokers)
After a while (hours) 2 brokers start to throw:
java.io.IOException: Connection to NN was disconnected before the response was read at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97) at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:97) at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:190) at kafka.server.AbstractFetcherThread.kafka$server$AbstractFetcherThread$$processFetchRequest(AbstractFetcherThread.scala:241) at kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:130) at kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:129) at scala.Option.foreach(Option.scala:257) at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:129) at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:111) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
File descriptors start to pile up and if I do not restart it throws "Too many open files" and crashes.
ERROR Error while accepting connection (kafka.network.Acceptor) java.io.IOException: Too many open files in system at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422) at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250) at kafka.network.Acceptor.accept(SocketServer.scala:460) at kafka.network.Acceptor.run(SocketServer.scala:403) at java.lang.Thread.run(Thread.java:748)
After some hours the issue happens again... It has happened with all brokers, so it is not something specific to an instance.
Attachments
Attachments
Issue Links
- is duplicated by
-
KAFKA-7697 Possible deadlock in kafka.cluster.Partition
- Resolved