Cassandra / CASSANDRA-11839

Active streams fail with SocketTimeoutException

Details

    • Normal

    Description

      The original purpose of streaming_socket_timeout_in_ms was to kill one-sided hanging streams (CASSANDRA-3838). The timeout itself was never much of a problem while the default was zero (never time out).

      In CASSANDRA-8611 we changed the default to 1 hour, but it was never enforced because of CASSANDRA-11286, which was fixed recently.

      In recent releases we've been receiving reports of stream failures when streaming large files (CASSANDRA-11345, CASSANDRA-11826). The sender's incoming socket becomes inactive and times out after 1 hour, and the stream session fails with SocketTimeoutException even though the session is still active. The session also fails if a 2i/MV rebuild takes longer than 1 hour on the receiver (CASSANDRA-8343).
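
The failure mode described above can be reproduced in miniature with plain Java sockets. The sketch below is not Cassandra code: the class and method names are hypothetical, and the read timeout set with `Socket.setSoTimeout` only stands in for the role that streaming_socket_timeout_in_ms plays. It shows that a read on a connection that is healthy but silent fails with SocketTimeoutException once the timeout elapses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of Cassandra.
public class IdleSocketTimeoutDemo {

    /** Returns true if a read on a connected but silent socket times out. */
    public static boolean readTimesOut(int timeoutMs) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {
            // Read timeout on the idle side of the connection.
            client.setSoTimeout(timeoutMs);
            InputStream in = client.getInputStream();
            try {
                // The peer stays connected but never writes, so this read blocks.
                in.read();
                return false;
            } catch (SocketTimeoutException e) {
                // The connection is still alive; only the read timed out.
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

The peer socket is open the whole time, which mirrors the report: the timeout fires on inactivity, not on a broken connection.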

      The definitive fix on trunk is to add a KeepAlive message to the stream protocol to detect broken connections, and to retire streaming_socket_timeout_in_ms. In older versions we must also increase the default streaming_socket_timeout_in_ms to a more conservative value, so that it can still detect long-hanging streams.
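
A minimal sketch of the keep-alive idea, assuming a one-byte message and a fixed send interval. The `KEEP_ALIVE` tag, the class name, and the scheduling approach are all illustrative assumptions and do not reflect the actual stream protocol change. The otherwise idle side sends a small message periodically, so the reader's timeout fires only when the connection really goes silent.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Cassandra.
public class KeepAliveDemo {
    // Hypothetical one-byte message tag.
    static final int KEEP_ALIVE = 0x01;

    /**
     * Reads until `count` keep-alives arrive.
     * Returns the number received, or -1 if the read timed out first.
     */
    public static int countKeepAlives(int timeoutMs, int intervalMs, int count) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket receiver = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket sender = server.accept()) {
            receiver.setSoTimeout(timeoutMs);
            OutputStream out = sender.getOutputStream();
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            // Periodically send a keep-alive so the reader sees activity.
            scheduler.scheduleAtFixedRate(() -> {
                try {
                    out.write(KEEP_ALIVE);
                    out.flush();
                } catch (IOException e) {
                    // A failed write signals a broken connection.
                    throw new UncheckedIOException(e);
                }
            }, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
            try {
                InputStream in = receiver.getInputStream();
                int received = 0;
                while (received < count) {
                    // Throws SocketTimeoutException if silent longer than timeoutMs.
                    int b = in.read();
                    if (b != KEEP_ALIVE) {
                        break; // end of stream or unexpected byte
                    }
                    received++;
                }
                return received;
            } catch (SocketTimeoutException e) {
                return -1;
            } finally {
                scheduler.shutdownNow();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("keep-alives received: " + countKeepAlives(500, 50, 3));
    }
}
```

As long as the send interval is comfortably shorter than the read timeout, an active session never trips the timeout, while a dead peer is still detected within one timeout period.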

            People

              pauloricardomg Paulo Motta
              pauloricardomg Paulo Motta
              Paulo Motta
              Yuki Morishita