HDFS-9669: TcpPeerServer should respect ipc.server.listen.queue.size


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.2
    • Fix Version/s: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
    • Component/s: None
    • Labels: None

    Description

      During periods of high traffic we are seeing:

      16/01/19 23:40:40 WARN hdfs.DFSClient: Connection failure: Failed to connect to /10.138.178.47:50010 for file /MYPATH/MYFILE for block BP-1935559084-10.138.112.27-1449689748174:blk_1080898601_7375294:java.io.IOException: Connection reset by peer
      java.io.IOException: Connection reset by peer
      	at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
      	at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
      	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
      	at sun.nio.ch.IOUtil.write(IOUtil.java:65)
      	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
      	at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
      	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
      	at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
      	at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
      	at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:109)
      	at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
      

      At the time this happens, there are far fewer xceivers than configured.

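      TcpPeerServer binds its server socket without honoring ipc.server.listen.queue.size, so on most JDKs the listen backlog falls back to the default of 50 pending connections at any time. This effectively means that any GC pause combined with busy time will result in TCP resets.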

      http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/java/net/ServerSocket.java#l370
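
      The backlog behaviour described above can be illustrated with plain JDK sockets. This is a minimal sketch, not the attached patch: the class name BacklogSketch, the helper names, and the use of java.util.Properties as a stand-in for Hadoop's configuration object are all assumptions made for illustration. Only the key ipc.server.listen.queue.size and the JDK default of 50 come from the issue.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.util.Properties;

// Hypothetical illustration class; not part of Hadoop.
public class BacklogSketch {
    // Configuration key named in the issue title.
    static final String QUEUE_SIZE_KEY = "ipc.server.listen.queue.size";
    // Backlog that java.net.ServerSocket uses when none (or a value < 1) is given.
    static final int JDK_DEFAULT_BACKLOG = 50;

    /** Returns the configured backlog, or the JDK default if unset or non-positive. */
    static int resolveBacklog(Properties conf) {
        int configured = Integer.parseInt(conf.getProperty(QUEUE_SIZE_KEY, "0"));
        return configured > 0 ? configured : JDK_DEFAULT_BACKLOG;
    }

    /** Binds an unbound server socket with an explicit listen backlog. */
    static ServerSocket bind(int port, int backlog) throws IOException {
        ServerSocket socket = new ServerSocket();
        socket.bind(new InetSocketAddress(port), backlog);
        return socket;
    }

    public static void main(String[] args) throws IOException {
        Properties conf = new Properties();
        conf.setProperty(QUEUE_SIZE_KEY, "256"); // example value, chosen arbitrarily
        // Port 0 asks the OS for any free port.
        try (ServerSocket socket = bind(0, resolveBacklog(conf))) {
            System.out.println("listening on port " + socket.getLocalPort());
        }
    }
}
```

      Note that the requested backlog is only a hint: the operating system may cap it (on Linux, for example, via net.core.somaxconn), so raising the configured value may also require raising the kernel limit.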

      Attachments

        1. HDFS-9669.0.patch
          5 kB
          Elliott Neil Clark
        2. HDFS-9669.1.patch
          5 kB
          Elliott Neil Clark
        3. HDFS-9669.1.patch
          5 kB
          Elliott Neil Clark


          People

            Assignee: Elliott Neil Clark (eclark)
            Reporter: Elliott Neil Clark (eclark)
            Votes: 0
            Watchers: 14
