Cassandra / CASSANDRA-8188

don't block SocketThread for MessagingService


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 2.0.12, 2.1.2

    Description

      We have two datacenters, A and B.
      The node in A cannot complete the version handshake with the nodes in B. The logs on A are as follows:

      	
          INFO [HANDSHAKE-/B] 2014-10-24 04:29:49,075 OutboundTcpConnection.java (line 395) Cannot handshake version with B
          TRACE [WRITE-/B] 2014-10-24 11:02:49,044 OutboundTcpConnection.java (line 368) unable to connect to /B
          java.net.ConnectException: Connection refused
              at sun.nio.ch.Net.connect0(Native Method)
              at sun.nio.ch.Net.connect(Net.java:364)
              at sun.nio.ch.Net.connect(Net.java:356)
              at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:623)
              at java.nio.channels.SocketChannel.open(SocketChannel.java:184)
              at org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:134)
              at org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:119)
              at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:299)
              at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:150)
      

      The jstack output from the nodes in B shows SocketThread blocked in inputStream.readInt, so it no longer accepts new sockets. The stack trace is as follows:

          java.lang.Thread.State: RUNNABLE
              at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
              at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
              at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
              at sun.nio.ch.IOUtil.read(IOUtil.java:197)
              at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
              - locked <0x00000007963747e8> (a java.lang.Object)
              at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:203)
              - locked <0x0000000796374848> (a java.lang.Object)
              at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
              - locked <0x00000007a5c7ca88> (a sun.nio.ch.SocketAdaptor$SocketInputStream)
              at java.io.InputStream.read(InputStream.java:101)
              at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
              - locked <0x00000007a5c7ca88> (a sun.nio.ch.SocketAdaptor$SocketInputStream)
              at java.io.DataInputStream.readInt(DataInputStream.java:387)
              at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:879)
      

      On the nodes in B, tcpdump shows SYN,ACK retransmissions during the TCP three-way handshake, because the TCP implementation drops the final ACK when the backlog queue is full.

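      On the nodes in B, ss -tl shows "Recv-Q 51 Send-Q 50" for the listening socket, meaning its accept queue is full: 51 connections are pending against a backlog of 50.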

      On the nodes in B, netstat -s shows that the counters “SYNs to LISTEN sockets dropped” and “times the listen queue of a socket overflowed” are both increasing.

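      The attached patch sets a read timeout of 2 * OutboundTcpConnection.WAIT_FOR_VERSION_MAX_TIME on each accepted socket, so a peer that connects but never sends its header can no longer block SocketThread indefinitely.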
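
      The idea can be illustrated with a minimal, self-contained sketch. This is not the patch itself: the class name, the helper readHeaderWithTimeout, and the constant WAIT_FOR_VERSION_MAX_TIME_MS (with a placeholder value) are hypothetical stand-ins, and a plain ServerSocket is used instead of Cassandra's own socket setup. It only shows how a read timeout on the accepted socket bounds the blocking readInt call.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class AcceptTimeoutSketch {
    // Hypothetical stand-in for OutboundTcpConnection.WAIT_FOR_VERSION_MAX_TIME;
    // the value is a placeholder chosen to keep the demo short.
    static final int WAIT_FOR_VERSION_MAX_TIME_MS = 500;

    /**
     * Reads the 4-byte header from an accepted socket.
     * Returns null if the peer sends nothing before the read timeout expires.
     */
    static Integer readHeaderWithTimeout(Socket socket) throws IOException {
        // Bound every blocking read on this socket.
        socket.setSoTimeout(2 * WAIT_FOR_VERSION_MAX_TIME_MS);
        try {
            return new DataInputStream(socket.getInputStream()).readInt();
        } catch (SocketTimeoutException e) {
            // The peer connected but stayed silent; give up instead of blocking.
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
             // A peer that connects and then never writes anything.
             Socket silentPeer = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            long start = System.nanoTime();
            Integer header = readHeaderWithTimeout(accepted);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("header=" + header + " after ~" + elapsedMs + " ms");
        }
    }
}
```

      With the placeholder value above, the silent peer should cause the read to give up after roughly one second, after which the accepting thread is free to close the socket and return to accept().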

          People

            Wei Yang (wy96f)
            Brandon Williams