KUDU-2160: Reduce UpdateConsensus RPC timeouts

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.5.0
    • Fix Version/s: n/a
    • Component/s: consensus
    • Labels: None

    Description

      We often see many UpdateConsensus() RPC calls time out when disks are slow. We need to investigate this further to understand the dynamics, then find a solution.

      When the local disks on a Kudu cluster are overloaded, the RaftConsensus metadata fsyncs triggered by Raft votes and term changes take longer, so the RaftConsensus lock is held for longer. Incoming UpdateConsensus() RPCs then "stack up" waiting for the lock and time out.
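
      The stacking effect described above can be illustrated with a toy model. This is a minimal sketch in Python, not Kudu's C++ implementation: the names `simulate`, `metadata_flush`, and `update_consensus` are hypothetical, a `time.sleep` stands in for the slow fsync, and the model assumes the flush holds a single lock that every UpdateConsensus() handler must also take.

```python
import threading
import time


def simulate(fsync_seconds, rpc_timeout_seconds, num_rpcs):
    """Toy model: count simulated RPCs that time out waiting for the lock.

    All names and timings here are illustrative assumptions, not Kudu code.
    """
    consensus_lock = threading.Lock()   # stand-in for the RaftConsensus lock
    lock_taken = threading.Event()      # signals that the flush holds the lock
    results = []                        # True = got the lock, False = timed out
    results_guard = threading.Lock()

    def metadata_flush():
        # Stand-in for a vote or term change persisting consensus metadata.
        with consensus_lock:
            lock_taken.set()
            time.sleep(fsync_seconds)   # pretend this is a slow fsync

    def update_consensus():
        # Each simulated RPC waits for the lock until its deadline expires.
        acquired = consensus_lock.acquire(timeout=rpc_timeout_seconds)
        if acquired:
            consensus_lock.release()
        with results_guard:
            results.append(acquired)

    flusher = threading.Thread(target=metadata_flush)
    flusher.start()
    lock_taken.wait()                   # make sure the flush owns the lock first

    rpcs = [threading.Thread(target=update_consensus) for _ in range(num_rpcs)]
    for t in rpcs:
        t.start()
    for t in rpcs:
        t.join()
    flusher.join()

    return results.count(False)
```

      Calling `simulate(0.5, 0.05, 5)` models a flush that outlasts the RPC deadline, so every queued call gives up before the lock is released. Calling `simulate(0.01, 1.0, 5)` models a fast disk, where the calls only wait briefly. The sketch suggests why the timeout count tracks fsync latency rather than RPC load, but it says nothing about which fix is appropriate.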

            People

              Assignee: Unassigned
              Reporter: Mike Percy (mpercy)
              Votes: 0
              Watchers: 2