Details
Type: Bug
Status: Resolved
Priority: Normal
Resolution: Fixed
Description
I've done a series of stress tests with eager retries enabled that show undesirable behavior. I'm grouping these behaviors into one ticket as they are most likely related.
1) Killing off a node in a 4-node cluster actually increases performance.
2) Compactions make nodes slow, even after the compaction is done.
3) Eager Reads tend to lessen the immediate performance impact of a node going down, but not consistently.
My Environment:
1 stress machine: node0
4 C* nodes: node4, node5, node6, node7
My script:
node0 writes some data: stress -d node4 -F 30000000 -n 30000000 -i 5 -l 2 -K 20
node0 reads some data: stress -d node4 -n 30000000 -o read -i 5 -K 20
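For context, the eager-retry trials depend on the table-level speculative_retry option added by CASSANDRA-4705. A minimal sketch of how a trial might be configured from cqlsh before the read phase (Keyspace1/Standard1 are the legacy stress tool's default keyspace/column family, and which exact value each trial used is an assumption on my part):
-- '95percentile' and 'ALWAYS' are two of the accepted speculative_retry values
ALTER TABLE "Keyspace1"."Standard1" WITH speculative_retry = '95percentile';
ALTER TABLE "Keyspace1"."Standard1" WITH speculative_retry = 'ALWAYS';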
Examples:
A node going down increases performance:
At 450s, I kill -9 one of the nodes. There is a brief decrease in performance as the snitch adapts, but then it recovers... to even higher performance than before.
Compactions make nodes permanently slow:
The green and orange lines represent trials with eager retry enabled; they never recover their pre-compaction op-rate, as the red and blue lines do.
Speculative Read tends to lessen the immediate impact:
This graph looked the most promising to me: the two trials with eager retry (the green and orange lines) showed the smallest dip in performance at 450s.
But not always:
This is a retrial with the same settings as above, yet the 95percentile eager retry (red line) did poorly this time at 450s.
Attachments
Issue Links
is related to: CASSANDRA-4705 Speculative execution for reads / eager retries (Resolved)