KUDU-2370

Allow accessing consensus metadata during flush/sync



    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.8.0
    • Fix Version/s: None
    • Component/s: consensus, perf
    • Labels: None


      In some cases when disks are overloaded or starting to go bad, flushing consensus metadata can take a significant amount of time. Currently, we hold the RaftConsensus::lock_ for the duration of operations such as voting or changing term, which blocks other requests such as writes or UpdateConsensus calls. There are certainly some cases where exposing "dirty" (non-durable) cmeta is illegal from a Raft perspective, but there are other cases where it is safe. For example:

      • Assume we receive a Write request, and we see that cmeta is currently busy flushing a change that marks the local replica as a FOLLOWER. In that case, if we wait on the lock, when we eventually acquire it, we'll just reject the request anyway. We might as well reject it immediately.
      • Assume we receive a Write request, and we see that cmeta is currently flushing a change that will mark the local replica as a LEADER in the next term. CheckLeadershipAndBindTerm can safely bind to the upcoming term rather than blocking until the flush completes.
      • Assume we receive an UpdateConsensus or Vote request for term N, and we see that we're currently flushing a change to term M > N. I think it's safe to reject the request even though the new term isn't yet durable.

      There are probably a few other cases where it's safe to act on not-yet-durable info.
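
      The three cases above could be sketched roughly as follows. This is only an illustration of the decision logic, not Kudu code: PendingCmeta, Decision, DecideWrite, and DecideTermRequest are hypothetical names, the Role enum is simplified, and the sketch assumes the in-flight (not yet durable) cmeta change can be inspected without taking RaftConsensus::lock_.

```cpp
#include <cstdint>

// All names below are hypothetical and exist only for this sketch.
enum class Role { FOLLOWER, LEADER, OTHER };

// A cmeta change already applied in memory but still being flushed to disk.
struct PendingCmeta {
  int64_t term;  // term the replica will be in once the flush completes
  Role role;     // role the replica will have once the flush completes
};

enum class Decision {
  kRejectNow,          // safe to fail the request without waiting
  kBindToPendingTerm,  // safe to proceed against the upcoming term
  kWaitForFlush        // not covered by the cases above; block as today
};

// Write request arriving while a cmeta flush is in progress.
Decision DecideWrite(const PendingCmeta& pending) {
  switch (pending.role) {
    case Role::FOLLOWER:
      // We would reject after acquiring the lock anyway.
      return Decision::kRejectNow;
    case Role::LEADER:
      // Bind to the upcoming term instead of blocking on the flush.
      return Decision::kBindToPendingTerm;
    default:
      return Decision::kWaitForFlush;
  }
}

// UpdateConsensus or Vote request for term `request_term` (N) arriving
// while a change to term `pending.term` (M) is being flushed.
Decision DecideTermRequest(const PendingCmeta& pending, int64_t request_term) {
  if (pending.term > request_term) {
    // M > N: the request is stale even though M is not yet durable.
    return Decision::kRejectNow;
  }
  // Any other relationship is outside the cases listed; stay conservative.
  return Decision::kWaitForFlush;
}
```

      Anything not explicitly covered falls back to kWaitForFlush, which matches the current behavior of blocking until the flush completes.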




            Assignee: Unassigned
            Reporter: Todd Lipcon (tlipcon)