Kudu / KUDU-2720

Improve concurrency of ResultTracker

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.10.0
    • Fix Version/s: None
    • Component/s: perf
    • Labels:
      None

      Description

      While running a workload that pushes many small batches from many clients, I see heavy contention on the spinlock in the ResultTracker:

      Stacks at 0228 14:19:29.339088 (service queue overflowed for kudu.tserver.TabletServerService):
        tids=[17223]
              0x379ba0f710 <unknown>
                  0x89ee80 <unknown>
                 0x1fb8f72 base::internal::SpinLockDelay()
                 0x1fb8ea7 base::SpinLock::SlowLock()
                 0x1e138dc kudu::rpc::ResultTracker::TrackRpc()
                 0x1e289e5 kudu::rpc::GeneratedServiceIf::Handle()
                 0x1e2935a kudu::rpc::ServicePool::RunThread()
                 0x1f9bd91 kudu::Thread::SuperviseThread()
              0x379ba079d1 start_thread
              0x379b6e88fd clone
      ...
        tids=[5695,5673]
              0x379ba0f710 <unknown>
                 0x1fb900a base::internal::SpinLockDelay()
                 0x1fb8ea7 base::SpinLock::SlowLock()
                 0x1e11b60 kudu::rpc::ResultTracker::IsCurrentDriver()
                  0xaaaf16 kudu::tablet::TransactionDriver::Prepare()
                  0xaabbdd kudu::tablet::TransactionDriver::PrepareTask()
                 0x1fa32dd kudu::ThreadPool::DispatchThread()
                 0x1f9bd91 kudu::Thread::SuperviseThread()
              0x379ba079d1 start_thread
              0x379b6e88fd clone
        tids=[5689,5696,5693,5692,5691,5690,5698,5688,5681,5682,5683,5685,5686,5687,5700,5669,5668,5667,5714,5704,5703,5702,5701,5697,5670,5665,5699,5664,5671,5672,5680]
              0x379ba0f710 <unknown>
                 0x1fb900a base::internal::SpinLockDelay()
                 0x1fb8ea7 base::SpinLock::SlowLock()
                 0x1e11bcc kudu::rpc::ResultTracker::RecordCompletionAndRespond()
                 0x1e15e6c kudu::rpc::RpcContext::RespondSuccess()
                  0xaad024 kudu::tablet::TransactionDriver::Finalize()
                  0xaad531 kudu::tablet::TransactionDriver::ApplyTask()
                 0x1fa32dd kudu::ThreadPool::DispatchThread()
                 0x1f9bd91 kudu::Thread::SuperviseThread()
              0x379ba079d1 start_thread
              0x379b6e88fd clone
      

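      In this case, the lock is held by thread 5679, which is inside google::protobuf::Message::SpaceUsedLong(), called from RecordCompletionAndRespond():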

        tids=[5679]
              0x379ba0f710 <unknown>
                 0x212f81b google::protobuf::Message::SpaceUsedLong()
                 0x1e11f2f kudu::rpc::ResultTracker::RecordCompletionAndRespond()
                 0x1e15e6c kudu::rpc::RpcContext::RespondSuccess()
                  0xaad024 kudu::tablet::TransactionDriver::Finalize()
                  0xaad531 kudu::tablet::TransactionDriver::ApplyTask()
                 0x1fa32dd kudu::ThreadPool::DispatchThread()
                 0x1f9bd91 kudu::Thread::SuperviseThread()
              0x379ba079d1 start_thread
              0x379b6e88fd clone
      

      KUDU-1622 contains several suggestions for improving the ResultTracker. Some of them were implemented; we should consider implementing the remaining ones.
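
      The holder stack above shows SpaceUsedLong() running while the ResultTracker lock is held. One possible direction is to compute the memory footprint before acquiring the lock, so the critical section only covers the map update. The following is a minimal sketch of that idea, not Kudu's actual code: FakeResponse, SketchTracker, and all of their members are hypothetical stand-ins, and std::mutex is used in place of Kudu's spinlock for portability.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a protobuf response message.
struct FakeResponse {
  std::string payload;
  // Stands in for a potentially expensive call such as
  // google::protobuf::Message::SpaceUsedLong().
  std::size_t SpaceUsed() const { return sizeof(FakeResponse) + payload.size(); }
};

// Hypothetical, heavily simplified tracker. The real ResultTracker
// keeps far more state (client IDs, drivers, ongoing RPCs, and so on).
class SketchTracker {
 public:
  void RecordCompletion(std::uint64_t seq_no, FakeResponse resp) {
    // Compute the footprint BEFORE taking the lock, so other threads
    // are not blocked while the size computation runs.
    const std::size_t bytes = resp.SpaceUsed();

    std::lock_guard<std::mutex> guard(lock_);
    // Only the cheap bookkeeping happens inside the critical section.
    auto inserted = completed_.emplace(seq_no, std::move(resp));
    if (inserted.second) {
      memory_bytes_ += bytes;
      assert(memory_bytes_ >= bytes);  // debug-only overflow check
    }
  }

  std::size_t memory_bytes() const {
    std::lock_guard<std::mutex> guard(lock_);
    return memory_bytes_;
  }

  std::size_t num_completed() const {
    std::lock_guard<std::mutex> guard(lock_);
    return completed_.size();
  }

 private:
  mutable std::mutex lock_;
  std::unordered_map<std::uint64_t, FakeResponse> completed_;
  std::size_t memory_bytes_ = 0;
};
```

      In this sketch a caller would invoke RecordCompletion(seq_no, std::move(resp)) from the apply path. Whether the same reordering is safe in the real ResultTracker depends on invariants not shown in this ticket, so treat it as an illustration of shrinking the critical section rather than a proposed patch.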

            People

            • Assignee:
              Unassigned
              Reporter:
              wdberkeley William Berkeley