Kudu / KUDU-2727

Contention on the Raft consensus lock can cause tablet service queue overflows

Details

    Description

      Here are stacks illustrating the phenomenon:

        tids=[2201]
              0x379ba0f710 <unknown>
                 0x1fb951a base::internal::SpinLockDelay()
                 0x1fb93b7 base::SpinLock::SlowLock()
                  0xb4e68e kudu::consensus::Peer::SignalRequest()
                  0xb9c0df kudu::consensus::PeerManager::SignalRequest()
                  0xb8c178 kudu::consensus::RaftConsensus::Replicate()
                  0xaab816 kudu::tablet::TransactionDriver::Prepare()
                  0xaac0ed kudu::tablet::TransactionDriver::PrepareTask()
                 0x1fa37ed kudu::ThreadPool::DispatchThread()
                 0x1f9c2a1 kudu::Thread::SuperviseThread()
              0x379ba079d1 start_thread
              0x379b6e88fd clone
        tids=[4515]
              0x379ba0f710 <unknown>
                 0x1fb951a base::internal::SpinLockDelay()
                 0x1fb93b7 base::SpinLock::SlowLock()
                  0xb74c60 kudu::consensus::RaftConsensus::NotifyCommitIndex()
                  0xb59307 kudu::consensus::PeerMessageQueue::NotifyObserversTask()
                  0xb54058 _ZN4kudu8internal7InvokerILi2ENS0_9BindStateINS0_15RunnableAdapterIMNS_9consensus16PeerMessageQueueEFvRKSt8functionIFvPNS4_24PeerMessageQueueObserverEEEEEEFvPS5_SC_EFvNS0_17UnretainedWrapperIS5_EEZNS5_34NotifyObserversOfCommitIndexChangeElEUlS8_E_EEESH_E3RunEPNS0_13BindStateBaseE
                 0x1fa37ed kudu::ThreadPool::DispatchThread()
                 0x1f9c2a1 kudu::Thread::SuperviseThread()
              0x379ba079d1 start_thread
              0x379b6e88fd clone
        tids=[22185,22194,22193,22188,22187,22186]
              0x379ba0f710 <unknown>
                 0x1fb951a base::internal::SpinLockDelay()
                 0x1fb93b7 base::SpinLock::SlowLock()
                  0xb8bff8 kudu::consensus::RaftConsensus::CheckLeadershipAndBindTerm()
                  0xaaaef9 kudu::tablet::TransactionDriver::ExecuteAsync()
                  0xaa3742 kudu::tablet::TabletReplica::SubmitWrite()
                  0x92812d kudu::tserver::TabletServiceImpl::Write()
                 0x1e28f3c kudu::rpc::GeneratedServiceIf::Handle()
                 0x1e2986a kudu::rpc::ServicePool::RunThread()
                 0x1f9c2a1 kudu::Thread::SuperviseThread()
              0x379ba079d1 start_thread
              0x379b6e88fd clone
        tids=[22192,22191]
              0x379ba0f710 <unknown>
                 0x1fb951a base::internal::SpinLockDelay()
                 0x1fb93b7 base::SpinLock::SlowLock()
                 0x1e13dec kudu::rpc::ResultTracker::TrackRpc()
                 0x1e28ef5 kudu::rpc::GeneratedServiceIf::Handle()
                 0x1e2986a kudu::rpc::ServicePool::RunThread()
                 0x1f9c2a1 kudu::Thread::SuperviseThread()
              0x379ba079d1 start_thread
              0x379b6e88fd clone
        tids=[4426]
              0x379ba0f710 <unknown>
                 0x206d3d0 <unknown>
                 0x212fd25 google::protobuf::Message::SpaceUsedLong()
                 0x211dee4 google::protobuf::internal::GeneratedMessageReflection::SpaceUsedLong()
                  0xb6658e kudu::consensus::LogCache::AppendOperations()
                  0xb5c539 kudu::consensus::PeerMessageQueue::AppendOperations()
                  0xb5c7c7 kudu::consensus::PeerMessageQueue::AppendOperation()
                  0xb7c675 kudu::consensus::RaftConsensus::AppendNewRoundToQueueUnlocked()
                  0xb8c147 kudu::consensus::RaftConsensus::Replicate()
                  0xaab816 kudu::tablet::TransactionDriver::Prepare()
                  0xaac0ed kudu::tablet::TransactionDriver::PrepareTask()
                 0x1fa37ed kudu::ThreadPool::DispatchThread()
                 0x1f9c2a1 kudu::Thread::SuperviseThread()
              0x379ba079d1 start_thread
              0x379b6e88fd clone
      

      kudu::consensus::RaftConsensus::CheckLeadershipAndBindTerm() must acquire the Raft consensus lock to check the term and the Raft role. When many RPCs arrive for the same tablet, contention on this lock can tie up RPC service threads and cause service queue overflows on busy systems.

      Yugabyte switched their equivalent lock to be an atomic that allows them to read the term and role wait-free.
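
      As a rough illustration of that idea (not Yugabyte's actual code and not a proposed Kudu patch), the term and role could be packed into one 64-bit atomic word. Writers that already hold the consensus lock publish a new value whenever the term or role changes; readers such as the leadership check load it without taking the lock. The names CachedTermAndRole, Role, and IsLeaderAndGetTerm below are hypothetical, and the sketch assumes the term fits in 62 bits and that std::atomic<uint64_t> is lock-free on the target platform.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical role enum for illustration; not Kudu's RaftPeerPB::Role.
enum class Role : uint8_t { kFollower = 0, kLeader = 1, kLearner = 2, kNonParticipant = 3 };

// The low kRoleBits bits hold the role, the remaining bits hold the term.
constexpr int kRoleBits = 2;
constexpr uint64_t kRoleMask = (1ULL << kRoleBits) - 1;
constexpr int64_t kMaxTerm = (1LL << 62) - 1;

struct TermAndRole {
  int64_t term;
  Role role;
};

// Caches (term, role) in a single atomic word so readers never block.
class CachedTermAndRole {
 public:
  // Intended to be called by a writer that already holds the consensus
  // lock, each time the term or the role changes.
  void Store(int64_t term, Role role) {
    assert(term >= 0 && term <= kMaxTerm);
    const uint64_t packed =
        (static_cast<uint64_t>(term) << kRoleBits) | static_cast<uint64_t>(role);
    packed_.store(packed, std::memory_order_release);
  }

  // A single atomic load: term and role are always read as a consistent pair.
  TermAndRole Load() const {
    const uint64_t packed = packed_.load(std::memory_order_acquire);
    return TermAndRole{static_cast<int64_t>(packed >> kRoleBits),
                       static_cast<Role>(packed & kRoleMask)};
  }

 private:
  std::atomic<uint64_t> packed_{0};  // term 0, follower
};

// Sketch of a lock-free leadership check: returns true and sets *term
// only if the cached role is leader.
inline bool IsLeaderAndGetTerm(const CachedTermAndRole& cache, int64_t* term) {
  const TermAndRole snapshot = cache.Load();
  if (snapshot.role != Role::kLeader) {
    return false;
  }
  *term = snapshot.term;
  return true;
}
```

      Packing both fields into one word matters here: two separate atomics could be read between a writer's two stores and yield a term and role that never coexisted. The cached value can still become stale immediately after the load, so any path that acts on it would need to tolerate a later rejection if the term or role has changed.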

            People

              aserbin Alexey Serbin
              wdberkeley William Berkeley