Cassandra / CASSANDRA-5932

Speculative read performance data show unexpected results

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 2.0.2
    • Component/s: None
    • Labels: None

      Description

      I've run a series of stress tests with eager retries enabled that show undesirable behavior. I'm grouping these behaviors into one ticket because they are most likely related.

      1) Killing off a node in a 4 node cluster actually increases performance.
      2) Compactions make nodes slow, even after the compaction is done.
      3) Eager Reads tend to lessen the immediate performance impact of a node going down, but not consistently.

      My Environment:
      1 stress machine: node0
      4 C* nodes: node4, node5, node6, node7

      My script:
      node0 writes some data: stress -d node4 -F 30000000 -n 30000000 -i 5 -l 2 -K 20
      node0 reads some data: stress -d node4 -n 30000000 -o read -i 5 -K 20

      Examples:

      A node going down increases performance:

      Data for this test here

      At 450s, I kill -9 one of the nodes. There is a brief decrease in performance as the snitch adapts, but then it recovers... to even higher performance than before.
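
One way to put numbers on the dip and recovery described above is to split the op-rate samples around the kill time. The sketch below is illustrative only: the `(seconds, ops_per_second)` sample format, the `summarize_event` helper, the 30 s settle window, and the sample values are all assumptions, not the data from this test.

```python
from statistics import mean


def summarize_event(samples, event_s, settle_s=30):
    """Summarize op rate before, during, and after an event.

    samples  -- iterable of (seconds, ops_per_second) pairs (assumed format)
    event_s  -- time of the event, e.g. 450 for the kill -9
    settle_s -- hypothetical window in which the snitch is assumed to adapt
    """
    before = [rate for t, rate in samples if t < event_s]
    dip = [rate for t, rate in samples if event_s <= t < event_s + settle_s]
    after = [rate for t, rate in samples if t >= event_s + settle_s]
    if not (before and dip and after):
        raise ValueError("need samples before, during, and after the event")
    return {
        "before": mean(before),  # steady-state rate before the event
        "dip_min": min(dip),     # worst rate while the cluster adapts
        "after": mean(after),    # steady-state rate after recovery
    }


# Synthetic values for illustration only, not measurements from this ticket.
samples = [
    (440, 100.0), (445, 100.0),
    (450, 60.0), (455, 80.0),
    (480, 110.0), (485, 110.0),
]
summary = summarize_event(samples, event_s=450)
```

If `after` comes out higher than `before`, the run shows the same pattern as this example: a lower op rate with all four nodes up than with one node killed.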

      Compactions make nodes permanently slow:


      The green and orange lines represent trials with eager retry enabled. Unlike the red and blue lines, they never recover the op-rate they had before the compaction.

      Data for this test here

      Speculative Read tends to lessen the immediate impact:


      This graph looked the most promising to me: the two trials with eager retry (the green and orange lines) showed the smallest dip in performance at 450s.

      Data for this test here

      But not always:


      This is a retrial with the same settings as above, yet the 95th-percentile eager retry (red line) did poorly this time at 450s.

      Data for this test here
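
For readers unfamiliar with percentile-based eager retry, the general idea can be sketched as follows: if the first replica has not answered within the chosen percentile of recently observed read latencies, the coordinator sends the read to another replica. This is a minimal sketch of that idea, not Cassandra's implementation. The function names, the nearest-rank percentile method, and the latency values are assumptions made for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of latencies."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def should_speculate(elapsed_ms, recent_latencies_ms, pct=95):
    """True once the wait exceeds the pct-th percentile of recent latencies."""
    return elapsed_ms > percentile(recent_latencies_ms, pct)


# Hypothetical recent read latencies in milliseconds.
recent = [2, 3, 3, 4, 4, 5, 5, 6, 8, 40]
threshold_ms = percentile(recent, 95)
```

Under this model, a single slow outlier in the recent window raises the threshold, so a retry may fire later than expected. That is one possible source of run-to-run variation, though the ticket does not establish it as the cause here.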

      1. eager-read-not-consistent.png
        61 kB
        Ryan McGuire
      2. eager-read-looks-promising.png
        53 kB
        Ryan McGuire
      3. compaction-makes-slow.png
        50 kB
        Ryan McGuire
      4. node-down-increase-performance.png
        32 kB
        Ryan McGuire
      5. eager-read-not-consistent-stats.png
        31 kB
        Ryan McGuire
      6. eager-read-looks-promising-stats.png
        31 kB
        Ryan McGuire
      7. compaction-makes-slow-stats.png
        32 kB
        Ryan McGuire
      8. 5932.txt
        23 kB
        Aleksey Yeschenko
      9. 5933-7a87fc11.png
        83 kB
        Ryan McGuire
      10. 5933-128_and_200rc1.png
        77 kB
        Ryan McGuire
      11. 5933-logs.tar.gz
        565 kB
        Ryan McGuire
      12. 5933-randomized-dsnitch-replica.png
        67 kB
        Ryan McGuire
      13. 5933-randomized-dsnitch-replica.2.png
        79 kB
        Ryan McGuire
      14. 5933-randomized-dsnitch-replica.3.png
        68 kB
        Ryan McGuire
      15. 5932.ded39c7e1c2fa.logs.tar.gz
        536 kB
        Ryan McGuire
      16. 5932-6692c50412ef7d.png
        76 kB
        Ryan McGuire
      17. 5932.6692c50412ef7d.compaction.png
        66 kB
        Ryan McGuire
      18. 5932.6692c50412ef7d.rr0.png
        99 kB
        Ryan McGuire
      19. 5932.6692c50412ef7d.rr1.png
        100 kB
        Ryan McGuire


            People

            • Assignee:
              Aleksey Yeschenko
              Reporter:
              Ryan McGuire
              Reviewer:
              Jonathan Ellis
            • Votes:
              1
              Watchers:
              13
