CASSANDRA-11839

Active streams fail with SocketTimeoutException

    Details

    • Severity:
      Normal

      Description

      The original reasoning behind streaming_socket_timeout_in_ms was to kill one-sided hanging streams (CASSANDRA-3838). This was never much of a problem while the default was zero (never time out).

      In CASSANDRA-8611 we changed the default to 1 hour, but it was never enforced because of CASSANDRA-11286, which was fixed recently.

      In recent releases we have been receiving reports of stream failures when streaming large files: the sender's incoming socket becomes inactive, times out after 1 hour, and the stream session fails with SocketTimeoutException (CASSANDRA-11345, CASSANDRA-11826), even though the stream session is still active. The session also fails if the 2i/MV rebuild takes longer than 1 hour on the receiver (CASSANDRA-8343).
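
      The failure mode can be reproduced in miniature with plain java.net sockets. The sketch below is not Cassandra code: the class and method names are hypothetical, and the timeout is passed in as a parameter so it can be a few hundred milliseconds rather than 1 hour. It only illustrates that a read timeout fires on a connection that is perfectly healthy but silent in one direction.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of Cassandra.
public class IdleSocketTimeoutDemo {

    /**
     * Opens a loopback connection, never writes anything from the peer,
     * and reports whether a blocking read on the other end timed out.
     */
    static boolean readTimesOutOnIdlePeer(int timeoutMs) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket sender = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket receiver = server.accept()) {
            // Plays the role of the streaming socket timeout on the sender's incoming side.
            sender.setSoTimeout(timeoutMs);
            try {
                // The receiver is connected and alive, but sends nothing back.
                sender.getInputStream().read();
                return false;
            } catch (SocketTimeoutException e) {
                // The connection is fine; only the read deadline expired.
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readTimesOutOnIdlePeer(200));
    }
}
```

      The timeout cannot tell an idle peer from a dead one, which is why a long transfer or a long rebuild on the receiver looks like a hung stream to the sender.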

      The definitive fix on trunk is to add a KeepAlive message to the stream protocol to detect broken connections, and to retire streaming_socket_timeout_in_ms. But we must also increase the default streaming_socket_timeout_in_ms in older versions to a more conservative value, so that it can still detect long-hanging streams.
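
      As a rough illustration of the keep-alive idea, the sketch below tracks when the peer was last heard from and treats any message, including a keep-alive, as proof of life. It is an assumption-laden sketch, not the actual trunk implementation: the class name, method names, and the silence threshold are all hypothetical, and timestamps are passed in explicitly to keep the logic deterministic.

```java
// Hypothetical sketch; not the Cassandra stream protocol implementation.
public class KeepAliveSketch {

    /** Records the last time any message (data or keep-alive) arrived from the peer. */
    static final class PeerLiveness {
        private final long maxSilenceMs;
        private long lastHeardMs;

        PeerLiveness(long maxSilenceMs, long nowMs) {
            this.maxSilenceMs = maxSilenceMs;
            this.lastHeardMs = nowMs;
        }

        /** Call for every incoming message, keep-alives included. */
        void onMessage(long nowMs) {
            lastHeardMs = nowMs;
        }

        /** The connection is considered broken only after prolonged total silence. */
        boolean isBroken(long nowMs) {
            return nowMs - lastHeardMs > maxSilenceMs;
        }
    }

    public static void main(String[] args) {
        PeerLiveness peer = new PeerLiveness(1_000, 0);
        // A keep-alive arrives at t=900 ms while a long operation is still running.
        peer.onMessage(900);
        System.out.println(peer.isBroken(1_800)); // only 900 ms of silence
        System.out.println(peer.isBroken(2_000)); // 1100 ms of silence
    }
}
```

      With periodic keep-alives, the length of a transfer or rebuild no longer matters; only the absence of all traffic for longer than the threshold marks the connection as broken.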

              People

              • Assignee:
                pauloricardomg Paulo Motta
                Reporter:
                pauloricardomg Paulo Motta
                Authors:
                Paulo Motta
                Reviewers:
                Yuki Morishita
