Details

      Description

      While trying to reproduce CASSANDRA-8343, I was not able to trigger a SocketTimeoutException by adding an artificial sleep longer than streaming_socket_timeout_in_ms.

      After investigation, I detected two problems:

      • A ReadableByteChannel created via socket.getChannel(), as done in ConnectionHandler.getReadChannel(socket), does not respect socket.setSoTimeout(), as explained in this blog post (see the sketch after this list):
        • “The only difference between “blocking NIO” and “NIO wrapped around IO” is that you can’t use socket timeout with SocketChannels. Why? Read the javadoc for setSoTimeout(): it says that this timeout is used only by streams.”

      • The socket's SO_TIMEOUT is never set on the "follower" side, only on the initiator side via DefaultConnectionFactory.createConnection(peer).
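
      To make the first problem concrete, below is a minimal standalone sketch (demo code written for this report, not from any patch), assuming a listener on localhost:9999 that accepts a connection but never writes anything (e.g. nc -l 9999). The read through the wrapped InputStream times out as configured, while a read through the SocketChannel itself would block indefinitely:

      import java.net.InetSocketAddress;
      import java.net.Socket;
      import java.net.SocketTimeoutException;
      import java.nio.ByteBuffer;
      import java.nio.channels.Channels;
      import java.nio.channels.ReadableByteChannel;
      import java.nio.channels.SocketChannel;

      public class SoTimeoutDemo
      {
          public static void main(String[] args) throws Exception
          {
              SocketChannel channel = SocketChannel.open(new InetSocketAddress("localhost", 9999));
              Socket socket = channel.socket();
              socket.setSoTimeout(1000); // 1s read timeout
              ByteBuffer buf = ByteBuffer.allocate(16);

              // Problem: reading through the channel ignores SO_TIMEOUT and
              // blocks until data arrives or the connection dies:
              //     channel.read(buf); // would hang here forever

              // Reading through a channel wrapped around the socket's
              // InputStream goes through the stream layer, which honors it:
              ReadableByteChannel wrapped = Channels.newChannel(socket.getInputStream());
              try
              {
                  wrapped.read(buf);
              }
              catch (SocketTimeoutException expected)
              {
                  System.out.println("read timed out after ~1s, as configured");
              }
              channel.close();
          }
      }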

      This may cause streaming to hang indefinitely, as exemplified by CASSANDRA-8621:

      For the scenario that prompted this ticket, it appeared that the streaming process was completely stalled. One side of the stream (the sender side) had an exception that appeared to be a connection reset. The receiving side appeared to think that the connection was still active, at least in terms of the netstats reported by nodetool. We were unable to verify whether this was specifically the case in terms of connected sockets due to the fact that there were multiple streams for those peers, and there is no simple way to correlate a specific stream to a tcp session.

        Activity

        pauloricardomg Paulo Motta added a comment -

        In order to verify that the socket timeout was indeed not being respected, I added a new property cassandra.dtest.sleep_during_stream_write to this branch and set it to a value much longer than streaming_socket_timeout_in_ms on this dtest, and verified that the bootstrap streaming session hung forever.

        Two changes are necessary to make the stream socket timeout be enforced during a stream session:

        • Creating the read channel via Channels.newChannel(socket.getInputStream()) in ConnectionHandler.getReadChannel(socket), as suggested in the blog post.
        • Setting the socket timeout on the follower side in IncomingStreamingConnection (both changes are sketched in code right after this list).
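
        In code, the two changes amount to roughly the following (a minimal sketch, not the literal patch; the timeout parameter mirrors the streaming_socket_timeout_in_ms yaml setting):

        import java.io.IOException;
        import java.net.Socket;
        import java.nio.channels.Channels;
        import java.nio.channels.ReadableByteChannel;

        public final class StreamingSocketFixes
        {
            // Change 1 (ConnectionHandler.getReadChannel): wrap the socket's
            // InputStream instead of returning socket.getChannel(), so reads
            // go through the stream layer and respect SO_TIMEOUT.
            static ReadableByteChannel getReadChannel(Socket socket) throws IOException
            {
                return Channels.newChannel(socket.getInputStream());
            }

            // Change 2 (IncomingStreamingConnection): set the timeout on the
            // follower side too; previously only the initiator set it.
            static void configureSocket(Socket socket, int streamingSocketTimeoutInMs) throws IOException
            {
                socket.setSoTimeout(streamingSocketTimeoutInMs);
            }
        }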

        After these changes, I re-executed the previous dtest on a fixed branch and verified that the bootstrap stream session did not hang, but instead failed. The reason the stream fails and is not retried is that SocketTimeoutException is an IOException, so it's caught by this block (and not this one) in IncomingFileMessage.
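
        For context, it is the exception hierarchy that routes the timeout to the failure path; a runnable schematic (not the actual IncomingFileMessage code):

        import java.io.IOException;
        import java.net.SocketTimeoutException;

        public class CatchOrderDemo
        {
            public static void main(String[] args)
            {
                try
                {
                    // SocketTimeoutException extends InterruptedIOException,
                    // which extends IOException...
                    throw new SocketTimeoutException("simulated stream read timeout");
                }
                catch (IOException e) // ...so this generic handler matches it:
                {
                    System.out.println("stream fails, no retry: " + e);
                }
            }
        }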

        This behavior of failing on socket timeout is not the one documented in cassandra.yaml:

        # Enable socket timeout for streaming operation.
        # When a timeout occurs during streaming, streaming is retried from the start
        # of the current file. This _can_ involve re-streaming an important amount of
        # data, so you should avoid setting the value too low.
        # Default value is 3600000, which means streams timeout after an hour.
        # streaming_socket_timeout_in_ms: 3600000
        

        So I updated it to:

        # Set socket timeout for streaming operation.
        # The stream session is failed if no data is received by any of the
        # participants within that period.
        # Default value is 3600000, which means streams timeout after an hour.
        # streaming_socket_timeout_in_ms: 3600000
        

        While retrying when receiving corrupted data is probably the right approach, I'm not sure retrying on a socket timeout is desirable here. I can see two main reasons for the socket to time out:

        • The connection was broken/reset on only one side of the socket (a rare but possible situation)
        • A deadlock or protocol error on the sender side

        In both scenarios, I think failing the stream is the correct approach, rather than retrying and dealing with unexpected error conditions. WDYT Yuki Morishita?

        Below are branches with the suggested changes and tests.

        2.1 2.2 3.0 trunk
        branch branch branch branch
        testall testall testall testall
        dtest dtest dtest dtest

        commit info: minor conflict on 2.2, but other than that it merges cleanly upwards.

        yukim Yuki Morishita added a comment -

        Thanks for the catch!
        Patches look good, though for 2.2+, the bootstrap_test dtest was failing in your results. (I ran it on my machine and the test was successful, though.)
        Can you run dtests on 2.2+ again to make sure bootstrap_test is ok?

        pauloricardomg Paulo Motta added a comment -

        I managed to reproduce the dtest failure, which is different from the failure on CASSANDRA-10912. The problem is that sometimes the stream process is killed before any range is transferred, so it's not possible to find the "already available. Skipping streaming." message in the logs.

        I submitted a DTEST PR that removes the non-deterministic check for the "skipping streaming" message and instead adds a stress check that the inserted data is present after the resumed bootstrap completes.

        pauloricardomg Paulo Motta added a comment -

        Canceling patch due to a minor nit that will be fixed shortly.

        pauloricardomg Paulo Motta added a comment -

        There is a relatively common scenario where the receiver gets a COMPLETE message from the sender, changes its state to WAIT_COMPLETE, and the IncomingMessageHandler blocks on the socket again, while in the meantime the OnCompletionRunnable finishes processing the received stream and closes the session via maybeCompleted(). Even though the session is terminated correctly on both sides, after streaming_socket_timeout_in_ms the user may get a harmless but scary SocketTimeoutException in the log from the previously blocked socket (I got this in dtests a few times). To fix this, I updated the patch to close the OutgoingStreamHandler after sending the last COMPLETE message, and the IncomingStreamHandler after receiving the last COMPLETE message, so all resources are properly released (a rough sketch follows the log example below).

        Below is an example of this scenario:

        DEBUG [STREAM-IN-/127.0.0.1] 2016-03-08 15:31:18,632 ConnectionHandler.java:269 - [Stream #f19e82b0-e55b-11e5-9a50-f59bd42ef741] Received Complete
        DEBUG [STREAM-OUT-/127.0.0.1] 2016-03-08 15:31:19,004 ConnectionHandler.java:341 - [Stream #f19e82b0-e55b-11e5-9a50-f59bd42ef741] Sending Complete
        DEBUG [StreamReceiveTask:1] 2016-03-08 15:31:19,004 ConnectionHandler.java:110 - [Stream #f19e82b0-e55b-11e5-9a50-f59bd42ef741] Closing stream connection handler on /127.0.0.1
        INFO  [StreamReceiveTask:1] 2016-03-08 15:31:19,005 StreamResultFuture.java:182 - [Stream #f19e82b0-e55b-11e5-9a50-f59bd42ef741] Session with /127.0.0.1 is complete
        INFO  [StreamReceiveTask:1] 2016-03-08 15:31:19,025 StreamResultFuture.java:214 - [Stream #f19e82b0-e55b-11e5-9a50-f59bd42ef741] All sessions completed
        
        .... streaming_socket_timeout passes...
        
        ERROR [STREAM-IN-/127.0.0.1] 2016-03-08 15:31:19,638 StreamSession.java:524 - [Stream #f19e82b0-e55b-11e5-9a50-f59bd42ef741] Streaming error occurred
        java.net.SocketTimeoutException: null
                at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211) ~[na:1.8.0_66]
                at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) ~[na:1.8.0_66]
                at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) ~[na:1.8.0_66]
                at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:53) ~[main/:na]
                at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:268) ~[main/:na]
                at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
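
        The fix, roughly (a sketch with hypothetical names; the real change lives in ConnectionHandler/StreamSession): once the final COMPLETE message has been sent or received, close that side's handler so no thread is left blocked on the socket, where it would eventually surface a spurious SocketTimeoutException.

        class StreamCompletionSketch
        {
            interface Handler { void close(); }

            private final Handler incoming;
            private final Handler outgoing;

            StreamCompletionSketch(Handler incoming, Handler outgoing)
            {
                this.incoming = incoming;
                this.outgoing = outgoing;
            }

            void onCompleteReceived()
            {
                // finish the session if both sides are done, then release the
                // read side immediately instead of blocking on the socket again
                maybeCompleted();
                incoming.close();
            }

            void onCompleteSent()
            {
                // release the write side once the last COMPLETE is on the wire
                outgoing.close();
            }

            private void maybeCompleted() { /* session bookkeeping elided */ }
        }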
        

        I also noticed that replace_address_test.TestReplaceAddress.replace_with_reset_resume_state_test (CASSANDRA-11246) and replace_address_test.TestReplaceAddress.resumable_replace_test were failing for the same reason as bootstrap_test.py:TestBootstrap.resumable_bootstrap_test (CASSANDRA-10912), so I applied the fix from CASSANDRA-10167 (decrease the socket timeout) to the dtest PR, which should keep these tests from hanging due to the default streaming_socket_timeout_in_ms of one hour.

        I rebased and updated branches and resubmitted tests:

        2.1 2.2 3.0 3.5 trunk dtest
        branch branch branch branch branch branch
        testall testall testall testall testall
        dtest dtest dtest dtest dtest
        pauloricardomg Paulo Motta added a comment -

        commit info: minor conflicts on 2.2 and 3.0.

        yukim Yuki Morishita added a comment -

        +1. Will commit.

        yukim Yuki Morishita added a comment -

        Committed in 561000aa3094699bab29766d9644ff50f6cb74f3, thanks!


          People

          • Assignee: Paulo Motta (pauloricardomg)
          • Reporter: Paulo Motta (pauloricardomg)
          • Reviewer: Yuki Morishita