Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Later
    • Fix Version/s: None
    • Component/s: Tools
    • Labels:

      Description

      We need a tool for both stressing and validating more complex workloads than stress currently supports. Stress needs a raft of changes, and it would be easier to deliver many of them as a single major endeavour, which I think is justifiable given its audience. The rough behaviours I want stress to support are:

      • Ability to know exactly how many rows it will produce, for any clustering prefix, without generating those prefixes (see the sketch after this list)
      • Ability to generate an amount of data proportional to the amount it will send to (or consume from) the server, rather than proportional to the variation in clustering columns
      • Ability to reliably produce near-identical behaviour on each run
      • Ability to understand complex overlays of operation types (LWT, delete, expiry; perhaps not all implemented immediately, but with a framework that supports them easily)
      • Ability to (with minimal internal state) understand the complete cluster state through overlays of multiple procedural generations
      • Ability to understand the in-flight state of in-progress operations (i.e. if we're applying a delete, understand that the delete may or may not have been applied, potentially for multiple conflicting in-flight operations)
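
      As a rough illustration of the first and third points, here is a minimal, hypothetical sketch (the class and method names are made up, and a real implementation would want per-prefix variation drawn from a distribution) of how a per-partition seed could determine row counts for any clustering prefix without materialising the rows:

          import java.util.SplittableRandom;

          // Hypothetical sketch, not the current stress internals: all quantities derive
          // purely from the partition seed, so each run is identical, and the number of
          // rows under any clustering prefix is known without generating the prefixes.
          public final class DeterministicPartition
          {
              private final int[] childrenPerLevel; // fan-out at each clustering level

              public DeterministicPartition(long partitionSeed, int clusteringDepth, int maxFanOut)
              {
                  SplittableRandom rnd = new SplittableRandom(partitionSeed);
                  childrenPerLevel = new int[clusteringDepth];
                  for (int level = 0; level < clusteringDepth; level++)
                      childrenPerLevel[level] = 1 + rnd.nextInt(maxFanOut);
              }

              // Exact number of CQL rows below a clustering prefix of the given length,
              // computed as a product of fan-outs; no rows or prefixes are generated.
              public long rowsUnderPrefix(int prefixLength)
              {
                  long rows = 1;
                  for (int level = prefixLength; level < childrenPerLevel.length; level++)
                      rows *= childrenPerLevel[level];
                  return rows;
              }

              public long totalRows()
              {
                  return rowsUnderPrefix(0);
              }
          }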

      I think the necessary changes to support this would give us the functional base for all the functionality I can currently envisage stress needing. Before embarking on this (which I may attempt very soon), it would be helpful to get input from others on features missing from stress that I haven't covered here but that we will certainly want in the future, so that they can be factored into the overall design and hopefully avoid another refactor a year from now; the complexity scales each time, and each time it is a higher sunk cost. Jonathan Ellis, Aleksey Yeschenko, Sylvain Lebresne, T Jake Luciani, Ryan McGuire, Ariel Weisberg, Branimir Lambov, Jonathan Shook ... and @everyone else


          Activity

          tjake T Jake Luciani added a comment -

          See CASSANDRA-8597

          My concern is this will be adding more complexity to an already complex tool. Can we agree on the user api/interactions before jumping into the implementation?

          benedict Benedict added a comment -

          Well, my goal is to make the tool itself only a little more complex, but to support much more complex behaviours, which is why it needs a major overhaul (to introduce these behaviours with the current design would be prohibitively complex).

          We do need to think about the API, though, yes. Perhaps we should actually reduce the number of knobs: we could simply offer a distribution of the total number of CQL rows in a partition, an optional ratio for each clustering column (defining where the row fan-out occurs on average), and a choice of category for how those rows are distributed: uniform, normal and extreme are likely sufficient (without any tweaking parameters).
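
          To make that knob set concrete, here is a hypothetical sketch (none of these names exist in stress today) of resolving a sampled total-row count and per-clustering-column ratios into per-level fan-outs; the distribution category would then govern how rows spread within that shape:

              // Hypothetical sketch only. A total-row count is sampled once per partition
              // from the configured distribution, then split across clustering levels
              // according to the per-column ratios, so that the product of per-level
              // fan-outs approximates the sampled total.
              final class PartitionShape
              {
                  enum Category { UNIFORM, NORMAL, EXTREME } // how rows spread within the shape

                  static int[] fanOuts(long totalRows, double[] clusteringRatios)
                  {
                      double ratioSum = 0;
                      for (double ratio : clusteringRatios)
                          ratioSum += ratio;

                      int[] fanOuts = new int[clusteringRatios.length];
                      for (int i = 0; i < clusteringRatios.length; i++)
                      {
                          // each level receives a share of the row budget proportional to its ratio
                          double share = clusteringRatios[i] / ratioSum;
                          fanOuts[i] = Math.max(1, (int) Math.round(Math.pow(totalRows, share)));
                      }
                      return fanOuts;
                  }
              }

              // e.g. fanOuts(10000, new double[] { 3, 1 }) -> { 1000, 10 }:
              // most of the fan-out happens at the first clustering column.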

          Any other API pieces we should consider? I've wondered whether we should support Nashorn for value generation, so that users can define their own arbitrary JavaScript, but this could have some performance implications. That is also orthogonal to this change.
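
          For reference, a minimal sketch of what Nashorn-based value generation could look like via the standard javax.script API (the function name and its (seed, row) contract are made up for illustration):

              import javax.script.Invocable;
              import javax.script.ScriptEngine;
              import javax.script.ScriptEngineManager;

              // Minimal sketch of user-supplied JavaScript value generation via Nashorn.
              public final class JsValueGenerator
              {
                  private final Invocable invocable;

                  public JsValueGenerator(String userScript) throws Exception
                  {
                      ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
                      // e.g. "function generate(seed, row) { return 'user-' + seed + '-' + row; }"
                      engine.eval(userScript);
                      this.invocable = (Invocable) engine;
                  }

                  public Object generate(long seed, long row) throws Exception
                  {
                      // each call crosses the Java/JS boundary, which is the performance concern
                      return invocable.invokeFunction("generate", seed, row);
                  }
              }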

          Almost all of CASSANDRA-8957 seems subsumed by this to me. Timeseries workloads are orthogonal to these changes, though, AFAICT, as they're basically just a matter of shifting the value domain based on seed.

          tjake T Jake Luciani added a comment -

          I think we should have ONE way to use the tool; right now there are old-old legacy, legacy, YAML, YAML + CLI flags, and CLI-flags-only modes.

          I think we should remove all forms of input other than YAML and some very light CLI options. My reasoning is that this gives us the best chance of documenting and capturing a reproducible profile, versus "I used this magic incantation of flags to get it to work... I think".

          benedict Benedict added a comment - edited

          I agree, but that is also independent of this goal. I plan to do that refactor first (as a separate ticket; I think I have a few related ones filed). I do intend to retain a "simple" mode, though, since the old mode is still widely used, but it will transparently create a StressProfile to perform it.

          edit: ... actually, we may disagree a little. I want to ensure the profile can specify everything, but the cli is still a very useful way to override a number of properties, especially for scripting. Forcing users to write a separate yaml for every possible test is really ugly IMO.
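
          For example (with a hypothetical profile and query name, and option spelling that varies a little between versions), the kind of scripted override I have in mind: the YAML defines the schema and generators, while the command line varies the per-run knobs:

              cassandra-stress user profile=./timeseries.yaml "ops(insert=1,read1=2)" n=1000000 -rate threads=200 -node 10.0.0.1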

          enigmacurry Ryan McGuire added a comment - edited

          The biggest thing I want is standardization of how we distribute stress across multiple clients (CASSANDRA-8469). Blindly running multiple clients is unlikely to be compatible with validation; it probably needs a decent amount of coordination between clients.

          benedict Benedict added a comment -

          CASSANDRA-8469 is another orthogonal issue, yeah. We really need to find a way for my evenings to not be the bottleneck on all of these stress features. There's months of development work to be done here.

          tjake T Jake Luciani added a comment -

          We really need to find a way for my evenings to not be the bottleneck on all of these stress features.

          Or others can help you

          benedict Benedict added a comment -

          That's my hope, yes

          jshook Jonathan Shook added a comment -

          It is good to see the discussion move in this direction.

          Benedict, All,
          Nearly all of what you describe in the list of behaviors is on my list for another project as well. Although it's still a fairly new project, there have been some early successes with demos and training tools. Here is a link that explains the project and its motives: https://github.com/jshook/metagener/blob/master/metagener-core/docs/README.md
          I'd be happy to talk in more detail about it. It seems like we have lots of the same ideas about what is needed at the foundational level.

          It's possible to achieve a drastic simplification of the user-facing part, but only if we are willing to revamp the notion of how we define test loads.

          RE: distributing test loads: I have been thinking about how to distribute stress across multiple clients as well. The gist of it is that we can't get there without having a way to automatically partition the client workload across some spectrum. As follow-on work, I think it can be done. First we need a conceptually obvious and clean way to define whole test loads such that they can be partitioned compatibly with the behaviors described above.
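
          As a hypothetical illustration of partitioning across a spectrum: if every operation is derived from a position in one global seed sequence, each client can own a disjoint, deterministic stripe of it, which is what makes later validation of the union tractable:

              // Hypothetical sketch: a single global sequence of operation seeds is striped
              // across clients, so each client knows exactly which slice of the workload it
              // owns, and validation can reason about the union of all slices.
              final class WorkloadSlice
              {
                  private final int clientIndex;   // 0-based index of this client
                  private final int clientCount;   // total number of coordinated clients

                  WorkloadSlice(int clientIndex, int clientCount)
                  {
                      this.clientIndex = clientIndex;
                      this.clientCount = clientCount;
                  }

                  // The i-th operation executed by this client maps to a unique global position.
                  long globalPosition(long localOperation)
                  {
                      return localOperation * clientCount + clientIndex;
                  }

                  boolean owns(long globalPosition)
                  {
                      return globalPosition % clientCount == clientIndex;
                  }
              }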

          If I can help, given the other work I've been doing, let's keep the conversation going.

          philipthompson Philip Thompson added a comment -

          The refactor needs to add support for collections; see CASSANDRA-9091.

          benedict Benedict added a comment -

          One more thing it should consider is the ability to efficiently compute multiple different projections of the same dataset, so that Global Indexes can be probed

          benedict Benedict added a comment -

          Consider "split-brain", failed write, etc state validation

          brianmhess Brian Hess added a comment -

          It would be good if you could leverage asynchronous execution instead of spawning so many threads yourself. You can have flags/options to limit the number of futures in flight (see FutureManager at https://github.com/brianmhess/cassandra-loader/blob/master/src/main/java/com/datastax/loader/futures/FutureManager.java for example) and the number of threads, in addition to limiting the rate (which is already an option).
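
          For what it's worth, a generic sketch of that bounding pattern, independent of any particular driver (executeAsync below stands in for whatever asynchronous call the tool ends up using):

              import java.util.concurrent.CompletableFuture;
              import java.util.concurrent.Semaphore;
              import java.util.function.Supplier;

              // Generic sketch: bound the number of in-flight asynchronous requests with a
              // semaphore, releasing a permit whenever a request completes.
              final class BoundedAsyncExecutor
              {
                  private final Semaphore inFlight;

                  BoundedAsyncExecutor(int maxInFlight)
                  {
                      this.inFlight = new Semaphore(maxInFlight);
                  }

                  <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> executeAsync) throws InterruptedException
                  {
                      inFlight.acquire(); // blocks the submitting thread once the limit is reached
                      try
                      {
                          CompletableFuture<T> future = executeAsync.get();
                          future.whenComplete((result, error) -> inFlight.release());
                          return future;
                      }
                      catch (RuntimeException e)
                      {
                          inFlight.release(); // don't leak the permit if the request never started
                          throw e;
                      }
                  }
              }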

          benedict Benedict added a comment -

          See CASSANDRA-8468 and CASSANDRA-8987.

          benedict Benedict added a comment -

          I think for consistency of testing, and for realism of read workloads, we need stress to be able to build sstables directly and serve them to each node when bootstrapping a test cluster that has its compactions disabled. We can then specify the distribution of data amongst these sstables, so that we produce data that looks like what a real cluster with live data might produce. Currently it is very hard to get a consistent state across different versions of a cluster for performing read tests, and replicating a realistic DTCS cluster state (for instance) is hard since we don't run for days, weeks or months (and artificially lowering the windows doesn't necessarily give us a realistic cluster state).

          I don't propose that this all be delivered as part of the rewrite, but I note it here to make certain it is considered at the same time, hopefully to be delivered soon after.
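
          For reference, Cassandra already ships a bulk writer that such a mode could drive; a rough sketch (the schema, statement and values here are placeholders) of what generating those sstables might look like:

              import java.io.File;
              import java.nio.ByteBuffer;
              import org.apache.cassandra.io.sstable.CQLSSTableWriter;

              // Rough sketch of driving the existing bulk writer from stress; the schema,
              // statement and generated values are placeholders.
              final class SSTableBuilder
              {
                  static void writeSSTables(File outputDir, long rows) throws Exception
                  {
                      String schema = "CREATE TABLE ks.tbl (key bigint, c bigint, value blob, PRIMARY KEY (key, c))";
                      String insert = "INSERT INTO ks.tbl (key, c, value) VALUES (?, ?, ?)";

                      CQLSSTableWriter writer = CQLSSTableWriter.builder()
                                                                .inDirectory(outputDir)
                                                                .forTable(schema)
                                                                .using(insert)
                                                                .build();
                      try
                      {
                          // deterministic values would come from the same generators as the live workload
                          for (long i = 0; i < rows; i++)
                              writer.addRow(i, i % 100, ByteBuffer.allocate(64));
                      }
                      finally
                      {
                          writer.close();
                      }
                  }
              }

          The resulting files could then be dropped into each node's data directory (or streamed with sstableloader) before the read workload starts.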


            People

            • Assignee: Unassigned
            • Reporter: benedict Benedict
            • Votes: 4
            • Watchers: 16
