Cassandra / CASSANDRA-6199

Improve Stress Tool

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Low
    • Resolution: Fixed
    • Fix Version/s: 2.1 beta1
    • Component/s: Legacy/Tools
    • Labels: None

    Description

      The stress tool could do with sprucing up. The following is a list of essential improvements and things that would be nice to have.

      Essential:

      • Reduce variability of results, especially in the start/end tails. Do not discard the first/last 10% of readings.
      • Reduce contention/overhead in stress to increase overall throughput.
      • Add a short warm-up period that is excluded from the summary (or summarised separately) but still prints progress as usual. Potentially detect rate levelling automatically.
      • Better configurability and defaults for data generation: the current column generation populates columns with the same value for every row, which is very easily compressible. Possibly introduce a partially random data generator (perhaps dictionary-based).
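
      As a rough illustration of the dictionary-based idea in the last item, the sketch below builds each column value from words picked at random out of a small, seeded dictionary. The class name, constructor parameters and word layout are all assumptions made for this example; it is not the generator the stress tool uses.

```java
import java.util.Random;

// Hypothetical sketch, not the stress tool's actual generator.
// Values are assembled from a small dictionary of random "words", so they
// differ from row to row while remaining partially compressible.
final class DictionaryValueGenerator {
    private final byte[][] dictionary;
    private final Random random;

    DictionaryValueGenerator(int dictionarySize, int wordLength, long seed) {
        if (dictionarySize < 1 || wordLength < 1) {
            throw new IllegalArgumentException("dictionary size and word length must be positive");
        }
        this.random = new Random(seed);
        this.dictionary = new byte[dictionarySize][wordLength];
        // Fill every dictionary word with random bytes once, up front.
        for (byte[] word : dictionary) {
            random.nextBytes(word);
        }
    }

    // Returns a value of exactly `size` bytes made of randomly chosen words;
    // the final word is truncated if it does not fit.
    byte[] next(int size) {
        byte[] value = new byte[size];
        int pos = 0;
        while (pos < size) {
            byte[] word = dictionary[random.nextInt(dictionary.length)];
            int n = Math.min(word.length, size - pos);
            System.arraycopy(word, 0, value, pos, n);
            pos += n;
        }
        return value;
    }
}
```

      In a sketch like this, a larger dictionary or shorter words would make the data look more random, while a smaller dictionary makes it more compressible, which is the kind of knob the item asks for.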

      Nice to have:

      • Calculate and print stdev and mean
      • Add batched sequential access mode (where a single thread performs batch-size sequential requests before selecting another random key) to test how key proximity affects performance
      • Auto-mode which attempts to establish the maximum throughput rate, by varying the thread count (or otherwise gating the number of parallel requests) for some period, then configures rate limit or thread count to test performance at e.g. 30%, 50%, 70%, 90%, 120%, 150% and unconstrained.
      • Auto-mode could have a target variance ratio for mean throughput and/or latency, and completes a test once this target is hit for x intervals
      • Fix the key representation so that it is independent of the number of keys (possibly switch to 10-digit hex), and don't use String.format().getBytes() to construct it (expensive).
        Also, remove the skip-key setting, as it is currently ignored, unless somebody knows the reason for it.
      • Fix latency stats
      • Read/write mode, with configurable recency-of-reads distribution
      • Add new exponential/extreme value distribution for value size, column count and recency-of-reads
      • Support more than 2^31 keys
      • Support multiple concurrent stress inserts via a key-offset parameter or similar
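
      The key-related items above (fixed-width hex keys, no String.format().getBytes(), more than 2^31 keys, a key offset) could fit together roughly as follows. This is only a sketch under assumptions: the class and method names are made up for illustration, and the offset is simply added to the key index.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the stress tool's actual key encoding.
final class HexKeys {
    // 10 hex digits cover 2^40 distinct keys, well beyond 2^31.
    private static final int WIDTH = 10;
    private static final byte[] HEX = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);

    // Encodes (index + offset) as a zero-padded, 10-digit lowercase hex key.
    // Digits are written directly into the byte array, so no intermediate
    // String is formatted or re-encoded.
    static byte[] encode(long index, long offset) {
        long key = Math.addExact(index, offset);
        if (key < 0 || key >= (1L << (4 * WIDTH))) {
            throw new IllegalArgumentException("key out of range: " + key);
        }
        byte[] out = new byte[WIDTH];
        for (int i = WIDTH - 1; i >= 0; i--) {
            out[i] = HEX[(int) (key & 0xF)];
            key >>>= 4;
        }
        return out;
    }
}
```

      Because the width is constant, the key bytes do not depend on how many keys a run uses, and two concurrent stress clients could be given disjoint offsets so that their key ranges do not overlap.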

      Attachments

        1. new.read.latency.svg (148 kB, Benedict Elliott Smith)
        2. new.read.rate.distribution.svg (8 kB, Benedict Elliott Smith)
        3. new.write.latency.svg (273 kB, Benedict Elliott Smith)
        4. new.write.rate.distribution.svg (14 kB, Benedict Elliott Smith)
        5. old.read.latency.svg (142 kB, Benedict Elliott Smith)
        6. old.read.rate.distribution.svg (10 kB, Benedict Elliott Smith)
        7. old.write.latency.svg (253 kB, Benedict Elliott Smith)
        8. old.write.rate.distribution.svg (15 kB, Benedict Elliott Smith)
        9. ops.read.svg (124 kB, Benedict Elliott Smith)
        10. ops.write.svg (225 kB, Benedict Elliott Smith)
        11. patch (431 kB, Benedict Elliott Smith)

      People

        Assignee: Benedict Elliott Smith
        Reporter: Benedict Elliott Smith
        Authors: Benedict Elliott Smith
        Reviewers: Pavel Yaskevich
        Votes: 0
        Watchers: 5