Cassandra / CASSANDRA-16701

Data Points for the CommitLog's WaitingOnCommit Metric Should Describe Single Mutations


Details

    Description

      The existing CommitLog metrics aren’t as useful as they could be when investigating the performance of local writes.

      1.) We have no way to know how long the actual flush to disk takes in isolation, i.e. separate from the signaling apparatus between mutation threads and the sync thread. We should add a metric for this.

      2.) The WaitingOnCommit metric can record multiple data points for a single mutation, which is awkward when we’re trying to break down the latency of a local write (total time for CL add + Memtable put, etc.). More specifically, a thread waits for the sync thread to catch up to the position of its mutation, but it can be woken by a sync that hasn’t reached that position yet, which triggers another wait. A new data point is recorded for the metric each time this happens. We should move the scope of metric recording up a level so that there is a 1-1 relationship between it and WriteLatency in TableMetrics (which covers row cache updates and the Memtable put).

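      // Blocks the calling mutation thread until the sync thread has synced past `position`.
      void waitForSync(int position, Timer waitingOnCommit)
      {
          while (lastSyncedOffset < position)
          {
              // A new timer context is started on every pass through the loop, so a
              // wake-up from a sync that has not reached `position` yet leads to
              // another data point for the same mutation.
              WaitQueue.Signal signal = waitingOnCommit != null ?
                                        syncComplete.register(waitingOnCommit.time()) :
                                        syncComplete.register();
              // Re-check after registering so that a sync completed in the meantime is not missed.
              if (lastSyncedOffset < position)
                  signal.awaitUninterruptibly();
              else
                  signal.cancel();
          }
      }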
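
      As a rough illustration of both proposals, here is a minimal, self-contained sketch. It is not Cassandra code and not necessarily how the ticket was resolved: the class SingleSampleWait, its methods, and the use of plain lists in place of Timer metrics are all hypothetical, and a ReentrantLock/Condition pair stands in for WaitQueue. The sync side times only the flush itself, and the waiting side starts its clock once, outside the retry loop, so each call records exactly one data point no matter how many times the thread wakes up early. The sketch also assumes that a call which never has to wait still records a (near-zero) sample, which keeps the count 1-1 with the number of mutations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a commit log segment; not Cassandra code.
class SingleSampleWait
{
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition syncComplete = lock.newCondition();
    private int lastSyncedOffset = 0;

    // Plain lists stand in for the real Timer metrics (both guarded by lock).
    private final List<Long> waitingOnCommitNanos = new ArrayList<>();
    private final List<Long> flushNanos = new ArrayList<>();

    // Sync thread side: time only the flush, separately from the signalling.
    void sync(int offset, Runnable flush)
    {
        long start = System.nanoTime();
        flush.run(); // stand-in for the actual flush to disk
        long elapsed = System.nanoTime() - start;

        lock.lock();
        try
        {
            flushNanos.add(elapsed);
            lastSyncedOffset = Math.max(lastSyncedOffset, offset);
            syncComplete.signalAll();
        }
        finally
        {
            lock.unlock();
        }
    }

    // Mutation thread side: one timer per call, started outside the retry loop.
    void waitForSync(int position)
    {
        long start = System.nanoTime();
        lock.lock();
        try
        {
            // Early wake-ups simply loop again; they do not record anything.
            while (lastSyncedOffset < position)
                syncComplete.awaitUninterruptibly();
            waitingOnCommitNanos.add(System.nanoTime() - start);
        }
        finally
        {
            lock.unlock();
        }
    }

    int waitSamples()
    {
        lock.lock();
        try { return waitingOnCommitNanos.size(); }
        finally { lock.unlock(); }
    }

    int flushSamples()
    {
        lock.lock();
        try { return flushNanos.size(); }
        finally { lock.unlock(); }
    }

    // One waiter at position 10, three syncs that reach it only on the last one.
    // Returns { wait samples, flush samples }.
    static int[] demo()
    {
        SingleSampleWait log = new SingleSampleWait();
        Thread writer = new Thread(() -> log.waitForSync(10));
        writer.start();
        try
        {
            for (int offset : new int[] { 3, 7, 10 })
            {
                Thread.sleep(20);
                log.sync(offset, () -> {});
            }
            writer.join();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
        return new int[] { log.waitSamples(), log.flushSamples() };
    }

    public static void main(String[] args)
    {
        int[] counts = demo();
        System.out.println("wait samples: " + counts[0] + ", flush samples: " + counts[1]);
    }
}
```

      In the demo, the waiter may be woken by the syncs at offsets 3 and 7 before the one at 10 releases it, yet only one wait sample is recorded, while each of the three syncs contributes its own flush sample.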
      


          People

            maedhroz Caleb Rackliffe
            maedhroz Caleb Rackliffe
            Caleb Rackliffe
            Yifan Cai