CASSANDRA-18110: Streaming progress virtual table lock contention can trigger TCP_USER_TIMEOUT and fail streaming

    Description

      Running four concurrent host replacements on a 4.1.0 development cluster has repeatedly failed to complete bootstrap: all four hosts fail bootstrap and stay in JOINING, logging the message below.

      ERROR 2022-12-07T21:15:48,860 [main] org.apache.cassandra.service.StorageService:2019 - Error while waiting on bootstrap to complete. Bootstrap will have to be restarted.
      

      Bootstrap fails because the FileStreamTasks on the streaming followers encounter an EOF while transmitting the files.

      ERROR 2022-12-07T15:49:39,164 [NettyStreaming-Outbound-/1.2.3.4.7000:2] org.apache.cassandra.streaming.StreamSession:718 - [Stream #8d313690-7674-11ed-813f-95c261b64a82] Streaming error occurred on session with peer 1.2.3.4:7000 through 1.2.3.4:40292
      org.apache.cassandra.net.AsyncChannelOutputPlus$FlushException: The channel this output stream was writing to has been closed
             at org.apache.cassandra.net.AsyncChannelOutputPlus.propagateFailedFlush(AsyncChannelOutputPlus.java:200) ~[cassandra.jar]
             at org.apache.cassandra.net.AsyncChannelOutputPlus.waitUntilFlushed(AsyncChannelOutputPlus.java:158) ~[cassandra.jar]
             at org.apache.cassandra.net.AsyncChannelOutputPlus.waitForSpace(AsyncChannelOutputPlus.java:140) ~[cassandra.jar]
             at org.apache.cassandra.net.AsyncChannelOutputPlus.beginFlush(AsyncChannelOutputPlus.java:97) ~[cassandra.jar]
             at org.apache.cassandra.net.AsyncStreamingOutputPlus.lambda$writeToChannel$0(AsyncStreamingOutputPlus.java:124) ~[cassandra.jar]
             at org.apache.cassandra.db.streaming.CassandraCompressedStreamWriter.lambda$write$0(CassandraCompressedStreamWriter.java:89) ~[cassandra.jar]
             at org.apache.cassandra.net.AsyncStreamingOutputPlus.writeToChannel(AsyncStreamingOutputPlus.java:120) ~[cassandra.jar]
             at org.apache.cassandra.db.streaming.CassandraCompressedStreamWriter.write(CassandraCompressedStreamWriter.java:88) ~[cassandra.jar]
             at org.apache.cassandra.db.streaming.CassandraOutgoingFile.write(CassandraOutgoingFile.java:177) ~[cassandra.jar]
             at org.apache.cassandra.streaming.messages.OutgoingStreamMessage.serialize(OutgoingStreamMessage.java:87) ~[cassandra.jar]
             at org.apache.cassandra.streaming.messages.OutgoingStreamMessage$1.serialize(OutgoingStreamMessage.java:45) ~[cassandra.jar]
             at org.apache.cassandra.streaming.messages.OutgoingStreamMessage$1.serialize(OutgoingStreamMessage.java:34) ~[cassandra.jar]
             at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:39) ~[cassandra.jar]
             at org.apache.cassandra.streaming.async.StreamingMultiplexedChannel$FileStreamTask.run(StreamingMultiplexedChannel.java:311) [cassandra.jar]
             at org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96) [cassandra.jar]
             at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61) [cassandra.jar]
             at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71) [cassandra.jar]
             at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
             at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
             at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-all-4.1.58.Final.jar:4.1.58.Final]
             at java.lang.Thread.run(Thread.java:829) [?:?]
             Suppressed: java.nio.channels.ClosedChannelException
                     at org.apache.cassandra.net.AsyncStreamingOutputPlus.doFlush(AsyncStreamingOutputPlus.java:82) ~[cassandra.jar]
                     at org.apache.cassandra.net.AsyncChannelOutputPlus.flush(AsyncChannelOutputPlus.java:229) ~[cassandra.jar]
                     at org.apache.cassandra.net.AsyncChannelOutputPlus.close(AsyncChannelOutputPlus.java:248) ~[cassandra.jar]
                     at org.apache.cassandra.streaming.async.NettyStreamingChannel$1.close(NettyStreamingChannel.java:141) ~[cassandra.jar]
                     at org.apache.cassandra.streaming.async.StreamingMultiplexedChannel$FileStreamTask.run(StreamingMultiplexedChannel.java:312) [cassandra.jar]
                     at org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96) [cassandra.jar]
                     at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61) [cassandra.jar]
                     at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71) [cassandra.jar]
                     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
                     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
                     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-all-4.1.58.Final.jar:4.1.58.Final]
                     at java.lang.Thread.run(Thread.java:829) [?:?]
      Caused by: io.netty.channel.unix.Errors$NativeIoException: writevAddresses(..) failed: Connection timed out
      
      

      Running a debug build with extra logging added to org.apache.cassandra.streaming.StreamSession#progress shows long stalls, up to 15 minutes, during which the streaming task logs no progress events. After each stall the next attempt to write hits EOF.
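      The extra logging was roughly the following (a sketch only, not the committed patch; the helper name and parameters are hypothetical, chosen to match the log lines quoted below, and in the debug build the equivalent statement was invoked from org.apache.cassandra.streaming.StreamSession#progress):

      import java.net.InetSocketAddress;

      import org.slf4j.Logger;
      import org.slf4j.LoggerFactory;

      // Sketch of the diagnostic logging: formats a progress event the same way as
      // the log lines quoted below. Names here are assumptions, not Cassandra internals.
      public final class StreamProgressLogging
      {
          private static final Logger logger = LoggerFactory.getLogger(StreamProgressLogging.class);

          public static void logProgress(String filename, long bytes, long total, int sessionIndex, InetSocketAddress peer)
          {
              long pct = total > 0 ? bytes * 100 / total : 0;
              logger.info("Got stream progress event {} {}/{} bytes ({}%) sent to idx:{}/{} for file {} bytes {} out of total {}",
                          filename, bytes, total, pct, sessionIndex, peer, filename, bytes, total);
          }
      }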

      Example caught by extra logging in org.apache.cassandra.streaming.StreamSession#progress

      INFO  2022-12-07T15:10:56,149 [NettyStreaming-Outbound-/1.2.3.4.7000:2] org.apache.cassandra.streaming.StreamSession:1082 - Got stream progress event /datadir/ks/tbl-xxx/ks-tbl-nb-87763-big-Data.db 22216704/25673094 bytes (86%) sent to idx:0/1.2.3.4 for file /datadir/ks/tbl-xxx/ks-tbl-nb-87763-big-Data.db bytes 22216704 out of total 25673094
      # ... 15min delay during which other streams can be seen transmitting to the same and other hosts
      INFO  2022-12-07T15:25:29,614 [NettyStreaming-Outbound-/1.2.3.4.7000:2] org.apache.cassandra.streaming.StreamSession:1082 - Got stream progress event /datadir/ks/tbl-xxx/ks-tbl-nb-87763-big-Data.db 22282240/25673094 bytes (86%) sent to idx:0/1.2.3.4 for file /datadir/ks/tbl-xxx/ks-tbl-nb-87763-big-Data.db bytes 22282240 out of total 25673094
      INFO  2022-12-07T15:25:29,615 [NettyStreaming-Outbound-/1.2.3.4.7000:2] org.apache.cassandra.streaming.StreamSession:1082 - Got stream progress event /datadir/ks/tbl-xxx/ks-tbl-nb-87763-big-Data.db 22347776/25673094 bytes (87%) sent to idx:0/1.2.3.4 for file /datadir/ks/tbl-xxx/ks-tbl-nb-87763-big-Data.db bytes 22347776 out of total 25673094
      
      
      INFO  2022-12-07T15:25:29,626 [NettyStreaming-Outbound-/1.2.3.4.7000:2] org.apache.cassandra.streaming.StreamSession:1082 - Got stream progress event /datadir/ks/tbl-xxx/ks-tbl-nb-87763-big-Data.db 23855104/25673094 bytes (92%) sent to idx:0/1.2.3.4 for file /datadir/ks/tbl-xxx/ks-tbl-nb-87763-big-Data.db bytes 23855104 out of total 25673094
      INFO  2022-12-07T15:25:29,627 [NettyStreaming-Outbound-/1.2.3.4.7000:2] org.apache.cassandra.streaming.StreamSession:1082 - Got stream progress event /datadir/ks/tbl-xxx/ks-tbl-nb-87763-big-Data.db 23920640/25673094 bytes (93%) sent to idx:0/1.2.3.4 for file /datadir/ks/tbl-xxx/ks-tbl-nb-87763-big-Data.db bytes 23920640 out of total 25673094
      # ...12min delay while other streams continue to transmit
      INFO  2022-12-07T15:37:29,775 [NettyStreaming-Outbound-/1.2.3.4.7000:2] org.apache.cassandra.streaming.StreamSession:1082 - Got stream progress event /datadir/ks/tbl-xxx/ks-tbl-nb-87763-big-Data.db 23986176/25673094 bytes (93%) sent to idx:0/1.2.3.4 for file /datadir/ks/tbl-xxx/ks-tbl-nb-87763-big-Data.db bytes 23986176 out of total 25673094
      

      The issue has not successfully been reproduced in other clusters or in a dtest, but it is reliably reproducible on this cluster. The hosts being used are relatively large, with an available CPU count of 32, which causes the streaming system to allow up to 32 concurrent streams (see the sizing sketch after the link below).

      https://github.com/apache/cassandra/blob/81c616826ab3c8aa96467771d0e074a874efdd77/src/java/org/apache/cassandra/streaming/async/StreamingMultiplexedChannel.java#L93
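      For reference, a minimal sketch of the sizing rule the line above points at (assumed, not a verbatim copy of the Cassandra source): the file-transfer executor is bounded by the available processor count, so a 32-CPU host can run up to 32 FileStreamTasks concurrently, all sharing the same stream rate limiter.

      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;

      // Sketch only: approximates DEFAULT_MAX_PARALLEL_TRANSFERS in
      // StreamingMultiplexedChannel; a plain fixed pool stands in for
      // Cassandra's internal executor factory.
      public final class ParallelTransferSizing
      {
          static final int MAX_PARALLEL_TRANSFERS = Runtime.getRuntime().availableProcessors();

          static ExecutorService fileTransferExecutor()
          {
              return Executors.newFixedThreadPool(MAX_PARALLEL_TRANSFERS);
          }
      }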

      One possible explanation is that the stream rate limiter is not fair and blocks a send long enough for the TCP user timeout to fire, causing the kernel to close the socket; the closure is then detected on the next write. Another is that something is taking unusually long to flush the socket in AsyncChannelOutputPlus.beginFlush.
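      The unfairness hypothesis can be illustrated with a toy program (a sketch, not Cassandra code; Cassandra's StreamRateLimiter is backed by Guava's RateLimiter, which makes no fairness guarantee). Thirty-two writer threads share one limiter, and each records the worst pause it saw between two granted permits; if an analogous pause on a real stream exceeded the streaming TCP user timeout, the kernel would abort the socket and the next write would fail with the FlushException shown above.

      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.concurrent.TimeUnit;
      import java.util.concurrent.atomic.AtomicLongArray;

      import com.google.common.util.concurrent.RateLimiter;

      // Toy demonstration of uneven waits on a shared, non-fair rate limiter.
      public class UnfairLimiterDemo
      {
          public static void main(String[] args) throws InterruptedException
          {
              final int threads = 32;                                // mirrors 32 available CPUs / parallel transfers
              final RateLimiter shared = RateLimiter.create(256.0);  // one limiter shared by every stream
              final AtomicLongArray worstGapMillis = new AtomicLongArray(threads);
              ExecutorService pool = Executors.newFixedThreadPool(threads);

              for (int t = 0; t < threads; t++)
              {
                  final int id = t;
                  pool.submit(() -> {
                      long last = System.nanoTime();
                      for (int i = 0; i < 100; i++)
                      {
                          shared.acquire();                          // blocks; no fairness between threads
                          long now = System.nanoTime();
                          long gapMs = TimeUnit.NANOSECONDS.toMillis(now - last);
                          if (gapMs > worstGapMillis.get(id))
                              worstGapMillis.set(id, gapMs);         // only this thread writes index id
                          last = now;
                      }
                  });
              }
              pool.shutdown();
              pool.awaitTermination(10, TimeUnit.MINUTES);

              for (int t = 0; t < threads; t++)
                  System.out.printf("thread %2d worst inter-permit gap %d ms%n", t, worstGapMillis.get(t));
          }
      }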

      Restarting the instances with the streaming TCP user timeout disabled on the cluster (set to zero in the bootstrapping instances' config, and updated over JMX on the other instances) allowed the host replacements to complete bootstrap successfully.
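      For context, the knob involved looks roughly like the following (a sketch, not Cassandra's actual channel setup): with the epoll transport, Netty exposes the kernel's TCP_USER_TIMEOUT per socket. A non-zero value tells the kernel to abort the connection if written data remains unacknowledged for that long; 0 leaves the kernel's default behaviour in place, which is what the workaround above relies on.

      import io.netty.bootstrap.Bootstrap;
      import io.netty.channel.ChannelInitializer;
      import io.netty.channel.epoll.EpollChannelOption;
      import io.netty.channel.epoll.EpollEventLoopGroup;
      import io.netty.channel.epoll.EpollSocketChannel;

      // Sketch of configuring TCP_USER_TIMEOUT (milliseconds) on an outbound
      // streaming channel; pipeline setup is omitted.
      public final class TcpUserTimeoutSketch
      {
          public static Bootstrap streamingBootstrap(int tcpUserTimeoutMillis)
          {
              return new Bootstrap()
                     .group(new EpollEventLoopGroup(1))
                     .channel(EpollSocketChannel.class)
                     // 0 removes the per-socket override and falls back to the kernel default
                     .option(EpollChannelOption.TCP_USER_TIMEOUT, tcpUserTimeoutMillis)
                     .handler(new ChannelInitializer<EpollSocketChannel>()
                     {
                         @Override
                         protected void initChannel(EpollSocketChannel ch)
                         {
                             // pipeline handlers would be added here
                         }
                     });
          }
      }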

            People

              Assignee: David Capwell
              Reporter: Jon Meredith
              Authors: David Capwell
              Reviewers: Abe Ratnofsky, Jon Meredith