Kafka / KAFKA-7757

Too many open files after java.io.IOException: Connection to n was disconnected before the response was read

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: core
    • Labels: None

    Description

      We upgraded a 3-broker cluster from 0.10.2.2 to 2.1.0.

      After a few hours, 2 of the brokers start to throw:

      java.io.IOException: Connection to NN was disconnected before the response was read
      at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97)
      at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:97)
      at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:190)
      at kafka.server.AbstractFetcherThread.kafka$server$AbstractFetcherThread$$processFetchRequest(AbstractFetcherThread.scala:241)
      at kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:130)
      at kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:129)
      at scala.Option.foreach(Option.scala:257)
      at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:129)
      at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:111)
      at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
      

      File descriptors start to pile up, and if I do not restart the broker, it throws "Too many open files" and crashes.

      ERROR Error while accepting connection (kafka.network.Acceptor)
      java.io.IOException: Too many open files in system
      at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
      at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
      at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
      at kafka.network.Acceptor.accept(SocketServer.scala:460)
      at kafka.network.Acceptor.run(SocketServer.scala:403)
      at java.lang.Thread.run(Thread.java:748)
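For anyone trying to reproduce this, here is a rough sketch (not part of the original report) for watching the descriptor count of a broker process against both the per-process and the system-wide limits. It assumes a Linux host with /proc mounted. The function names and the `proc_root` / `path` parameters are made up for illustration. One thing worth checking: on Linux the message "Too many open files in system" typically corresponds to ENFILE (system-wide file table exhausted), while plain "Too many open files" typically corresponds to EMFILE (per-process limit), so both limits may be worth looking at.

```python
import os
import resource


def count_open_fds(pid="self", proc_root="/proc"):
    """Count descriptors currently open in a process.

    Linux only: each open descriptor is one entry in <proc_root>/<pid>/fd.
    Reading another user's process usually needs the same user or root.
    """
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def per_process_limit():
    """Return (soft, hard) RLIMIT_NOFILE for the *current* process.

    Note: this is the limit of the Python process, not of the broker;
    the broker's own limit is listed in /proc/<pid>/limits.
    """
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def system_file_table(path="/proc/sys/fs/file-nr"):
    """Return (allocated, maximum) file handles for the whole system.

    file-nr holds three numbers: allocated handles, allocated-but-unused
    handles, and the system-wide maximum (fs.file-max).
    """
    with open(path) as f:
        allocated, _unused, maximum = (int(x) for x in f.read().split())
    return allocated, maximum
```

Calling `count_open_fds(pid)` with the broker's PID every minute or so and logging the result next to `system_file_table()` should show whether the growth is steady or comes in spikes after the disconnect errors.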
      

       

      After some hours the issue happens again. It has happened on all brokers, so it is not specific to one instance.
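Since the leak shows up on every broker, it may help to see what kind of descriptors are piling up (sockets versus log segment files). The sketch below is only an illustration under the same assumptions as before (Linux, /proc available); `classify_target`, `fd_breakdown` and the `proc_root` parameter are hypothetical names, not anything from Kafka.

```python
import os
from collections import Counter


def classify_target(target):
    """Map a /proc/<pid>/fd symlink target to a coarse category."""
    if target.startswith("socket:"):
        return "socket"
    if target.startswith("pipe:"):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"  # e.g. epoll instances used by NIO selectors
    if target.startswith("/"):
        return "file"
    return "other"


def fd_breakdown(pid, proc_root="/proc"):
    """Count the open descriptors of a process per category."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # Descriptor was closed between listdir() and readlink().
            continue
        counts[classify_target(target)] += 1
    return counts
```

If the "socket" count keeps growing while "file" stays flat, that would point at connections not being closed rather than at log segments.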

       

      Attachments

        1. server.properties
          0.7 kB
          Pedro Gontijo
        2. td1.txt
          81 kB
          Pedro Gontijo
        3. td3.txt
          82 kB
          Pedro Gontijo
        4. td2.txt
          82 kB
          Pedro Gontijo
        5. kafka-allocated-file-handles.png
          12 kB
          Mathias Kub
        6. Screen Shot 2019-01-03 at 12.20.38 PM.png
          17 kB
          Jeff Nadler
        7. fd-spike-threads.txt
          79 kB
          Jeff Nadler
        8. dump.txt
          165 kB
          arthur
        9. image-2021-04-29-11-24-22-704.png
          20 kB
          luws
        10. image-2021-04-29-11-25-41-208.png
          17 kB
          luws
        11. image-2021-04-29-11-26-34-894.png
          18 kB
          luws
        12. image-2021-04-29-11-27-12-924.png
          18 kB
          luws
        13. image-2021-04-29-11-27-35-679.png
          24 kB
          luws

          People

            Assignee: Unassigned
            Reporter: Pedro Gontijo (pedrong)