CASSANDRA-18580

Baseline Metrics for Accord Transactions

Details

    Description

      Based on conversations with benedict and dcapwell, this is the initial set of metrics that seems both feasible to implement and useful for monitoring the health of a cluster performing Accord transactions:

      1.) Basic latency metrics for transactions up to the point of COMMIT and rate metrics for preemption, failure, and timeouts at the coordinator.

      This has already been implemented and split into read-specific and write-specific metrics. For now, our position is that preemption metrics are a usable stand-in for a harder-to-define metric counting how many transactions complete via recovery.
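
      For illustration only, here is a minimal sketch of what coordinator-side recording could look like. It is not the implementation referred to above: the class name, the Outcome enum, and all method names are hypothetical, and plain counters stand in for whatever latency and rate types the real metrics use.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch; none of these names come from the Cassandra or Accord codebase.
final class CoordinatorTxnMetricsSketch
{
    enum Outcome { COMMITTED, PREEMPTED, FAILED, TIMED_OUT }

    private final LongAdder commits = new LongAdder();
    private final LongAdder commitLatencyNanos = new LongAdder();
    private final LongAdder preempts = new LongAdder();
    private final LongAdder failures = new LongAdder();
    private final LongAdder timeouts = new LongAdder();

    // Latency is only accumulated for transactions that reached COMMIT.
    void record(Outcome outcome, long startNanos, long endNanos)
    {
        switch (outcome)
        {
            case COMMITTED:
                commits.increment();
                commitLatencyNanos.add(endNanos - startNanos);
                break;
            case PREEMPTED:
                preempts.increment();
                break;
            case FAILED:
                failures.increment();
                break;
            case TIMED_OUT:
                timeouts.increment();
                break;
        }
    }

    long commits()  { return commits.sum(); }
    long preempts() { return preempts.sum(); }
    long failures() { return failures.sum(); }
    long timeouts() { return timeouts.sum(); }

    // Mean latency up to COMMIT, in microseconds; 0 when nothing has committed yet.
    double meanCommitLatencyMicros()
    {
        long n = commits.sum();
        return n == 0 ? 0.0 : commitLatencyNanos.sum() / 1000.0 / n;
    }
}
```

      Under these assumptions, the read/write split would amount to holding one instance for read transactions and one for write transactions, and a rate would be derived by sampling the counters over time.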

      2.) Global cache stats/metrics (i.e. aggregated for all command stores)

      We could, at some point, build metrics scoped to a specific CommandStore, but they might be awkward in MBean/JMX space, as command stores would have to be identified by ID or key range, and key ranges may change across epochs. (An alternative would be to publish command store-specific stats on demand to a virtual table instead.)
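
      As a rough sketch of the "aggregated for all command stores" idea, the snippet below sums per-store cache counters into a single global snapshot. The StoreCacheStats type and its fields (hits, misses, sizeBytes) are assumptions made for this example, not existing Accord types.

```java
import java.util.List;

// Hypothetical sketch; the types and field names are assumptions for illustration.
final class GlobalCacheStatsSketch
{
    // Immutable snapshot of one command store's cache counters.
    static final class StoreCacheStats
    {
        final long hits;
        final long misses;
        final long sizeBytes;

        StoreCacheStats(long hits, long misses, long sizeBytes)
        {
            this.hits = hits;
            this.misses = misses;
            this.sizeBytes = sizeBytes;
        }
    }

    // Sums the per-store snapshots into one global snapshot, so no store ID
    // or key range has to appear in the published metric name.
    static StoreCacheStats aggregate(List<StoreCacheStats> perStore)
    {
        long hits = 0, misses = 0, sizeBytes = 0;
        for (StoreCacheStats s : perStore)
        {
            hits += s.hits;
            misses += s.misses;
            sizeBytes += s.sizeBytes;
        }
        return new StoreCacheStats(hits, misses, sizeBytes);
    }

    // Fraction of lookups served from cache; 0 when there were no lookups.
    static double hitRate(StoreCacheStats stats)
    {
        long requests = stats.hits + stats.misses;
        return requests == 0 ? 0.0 : (double) stats.hits / requests;
    }
}
```

      Note that the global hit rate is computed from the summed counters rather than by averaging per-store hit rates, so busy stores weigh more than idle ones.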

      3.) Something like a decaying histogram of the number of dependencies per transaction (or per partial transaction).

      If this gets worse over time, it could be a way for us to detect that contention is increasing. We should be able to hook this up to ProgressLog notifications. Recording for PartialDeps/PartialTxn (which ProgressLog gives us at pre-accept) seems acceptable, given that this is a directional metric.
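
      To make the "decaying histogram" idea concrete, here is a small hand-rolled sketch. It is an assumption-laden illustration, not a proposal for the actual reservoir: the class name, bucket bounds, and half-life parameter are all hypothetical, and time is passed in explicitly so the behaviour is easy to reason about. Each recorded dependency count adds weight 1 to its bucket, and all weights are halved for every half-life that elapses, so recent transactions dominate the quantiles.

```java
// Hypothetical sketch; not an existing Cassandra or Accord class.
final class DecayingDepsHistogramSketch
{
    // Inclusive upper bounds of the buckets (assumed values for illustration).
    private static final long[] UPPER_BOUNDS = { 0, 1, 2, 4, 8, 16, 32, 64, Long.MAX_VALUE };

    private final double[] weights = new double[UPPER_BOUNDS.length];
    private final double halfLifeSeconds;
    private double lastUpdateSeconds = 0.0;

    DecayingDepsHistogramSketch(double halfLifeSeconds)
    {
        this.halfLifeSeconds = halfLifeSeconds;
    }

    // Would be called once per transaction, e.g. from a ProgressLog-style hook (assumption).
    synchronized void record(int dependencyCount, double nowSeconds)
    {
        decayTo(nowSeconds);
        weights[bucketFor(dependencyCount)] += 1.0;
    }

    // Upper bound of the bucket containing the requested quantile (0..1).
    // Decay scales every bucket equally, so no timestamp is needed to read.
    synchronized long quantileUpperBound(double quantile)
    {
        double total = 0.0;
        for (double w : weights)
            total += w;
        if (total == 0.0)
            return 0;

        double target = quantile * total;
        double cumulative = 0.0;
        for (int i = 0; i < weights.length; i++)
        {
            cumulative += weights[i];
            if (cumulative >= target)
                return UPPER_BOUNDS[i];
        }
        return UPPER_BOUNDS[UPPER_BOUNDS.length - 1];
    }

    // Halves all weights for every elapsed half-life; ignores timestamps that go backwards.
    private void decayTo(double nowSeconds)
    {
        if (nowSeconds <= lastUpdateSeconds)
            return;
        double factor = Math.pow(0.5, (nowSeconds - lastUpdateSeconds) / halfLifeSeconds);
        for (int i = 0; i < weights.length; i++)
            weights[i] *= factor;
        lastUpdateSeconds = nowSeconds;
    }

    private static int bucketFor(int dependencyCount)
    {
        for (int i = 0; i < UPPER_BOUNDS.length; i++)
            if (dependencyCount <= UPPER_BOUNDS[i])
                return i;
        return UPPER_BOUNDS.length - 1;
    }
}
```

      With a 10-second half-life, 100 transactions with one dependency each recorded at t=0 are outweighed by just 10 transactions with 20 dependencies each recorded at t=100, which is the kind of upward drift the metric is meant to surface.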

            People

              jlewandowski Jacek Lewandowski
              maedhroz Caleb Rackliffe
              Jacek Lewandowski
              Caleb Rackliffe, David Capwell, Henrik Ingo

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 8h 40m