I performed two upgrades on the current cluster (currently 15 nodes, 1 DC, private VLAN).
First it was 22.214.171.124, and repair worked flawlessly.
The second upgrade was to 3.0.9 (with upgradesstables), and repair also worked well.
Then, 2 weeks ago, I upgraded to 3.9, and the repair problems started.
There are several types of errors in system.log (from different nodes):
- Sync failed between /xxx.xxx.xxx.xxx and /xxx.xxx.xxx.xxx
- Streaming error occurred on session with peer xxx.xxx.xxx.xxx Operation timed out - received only 0 responses
- Remote peer xxx.xxx.xxx.xxx failed stream session
- Session completed with the following error
org.apache.cassandra.streaming.StreamException: Stream failed
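
To see how often each of these errors shows up on a given node, a small script along the following lines could tally them. This is only a minimal sketch: the patterns are substrings copied from the messages above, while the labels and the function names (`count_repair_errors`, `count_in_file`) are made up for illustration and are not part of Cassandra.

```python
import re
from collections import Counter

# Substrings taken from the error messages quoted above. The labels are
# illustrative names only; Cassandra does not define them.
PATTERNS = {
    "sync_failed": re.compile(r"Sync failed between"),
    "streaming_error": re.compile(r"Streaming error occurred on session with peer"),
    "remote_peer_failed": re.compile(r"Remote peer \S+ failed stream session"),
    "stream_exception": re.compile(r"StreamException: Stream failed"),
}


def count_repair_errors(lines):
    """Count how many lines match each of the error patterns above."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


def count_in_file(path):
    """Run the same count over a log file (the path is supplied by the caller)."""
    with open(path, errors="replace") as fh:
        return count_repair_errors(fh)
```

Calling `count_in_file` with the path to a node's system.log (the location depends on the installation, so it is left as a parameter) returns a `Counter` keyed by the labels above, which makes it easier to compare nodes.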
I use the default 3.9 configuration, adjusted only for the cluster settings (3 seeds, GossipingPropertyFileSnitch).
streaming_socket_timeout_in_ms is at its default (86400000, i.e. 24 hours).
I'm worried about consistency problems while I'm not able to run repair.