Kudu / KUDU-2708

Possible contention creating temporary files while flushing cmeta during an election storm

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: perf
    • Labels:
      None

      Description

      While investigating consensus queue overflows that happen under heavy write load, I noticed that 6 of the 10 service threads at the time of overflow had stacks like:

      0x3b6720f710 <unknown>
                 0x1fb900a base::internal::SpinLockDelay()
                 0x1fb8ea7 base::SpinLock::SlowLock()
                  0xb82e25 kudu::consensus::RaftConsensus::RequestVote()
                  0x931555 kudu::tserver::ConsensusServiceImpl::RequestConsensusVote()
                 0x1e28a2c kudu::rpc::GeneratedServiceIf::Handle()
                 0x1e2935a kudu::rpc::ServicePool::RunThread()
                 0x1f9bd91 kudu::Thread::SuperviseThread()
              0x3b672079d1 start_thread
              0x3b66ee88fd clone
      

      They are waiting on the lock_ of some tablet's Raft consensus instance in order to vote. Looking into what might be holding that lock, I see stacks like:

      0x3b6720f710 <unknown>
              0x3b66edb2ed __GI_open64
              0x3b66e63caa __gen_tempname
                 0x1f1cf35 kudu::(anonymous namespace)::PosixEnv::MkTmpFile()
                 0x1f1f662 kudu::(anonymous namespace)::PosixEnv::NewTempRWFile()
                 0x1f8305e kudu::pb_util::WritePBContainerToPath()
                  0xb47932 kudu::consensus::ConsensusMetadata::Flush()
                  0xb74164 kudu::consensus::RaftConsensus::SetVotedForCurrentTermUnlocked()
                  0xb783aa kudu::consensus::RaftConsensus::RequestVoteRespondVoteGranted()
                  0xb836a1 kudu::consensus::RaftConsensus::RequestVote()
                  0x931555 kudu::tserver::ConsensusServiceImpl::RequestConsensusVote()
                 0x1e28a2c kudu::rpc::GeneratedServiceIf::Handle()
                 0x1e2935a kudu::rpc::ServicePool::RunThread()
                 0x1f9bd91 kudu::Thread::SuperviseThread()
              0x3b672079d1 start_thread
              0x3b66ee88fd clone
      

      After some junior spelunking into the glibc code, one hypothesis is that we are generating many collisions among proposed temporary file names in the cmeta directory, because many threads are attempting to flush cmeta at once. The relevant glibc function is __gen_tempname, which appears in the stack above calling open64.
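
To make the hypothesis concrete, here is a simplified sketch of the create-exclusively-and-retry pattern that mkstemp-style functions rely on. It is not the glibc source: `OpenUniqueFile`, `TmpOpenResult`, and the caller-supplied name generator are hypothetical names introduced only for illustration. The point it shows is that every name collision costs one more `open()` call, and in the stack above that call happens while the consensus lock is held.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <functional>
#include <string>
#include <unistd.h>

// Hypothetical result type for this sketch.
struct TmpOpenResult {
  int fd;            // Open file descriptor, or -1 on failure.
  int attempts;      // Number of open() calls that were made.
  std::string path;  // Path of the file that was created, if any.
};

// Tries candidate names until one can be created exclusively.
// 'next_name' stands in for the pseudo-random name generation step.
TmpOpenResult OpenUniqueFile(const std::function<std::string()>& next_name,
                             int max_attempts) {
  TmpOpenResult result{-1, 0, ""};
  for (int i = 0; i < max_attempts; ++i) {
    const std::string candidate = next_name();
    ++result.attempts;
    // O_EXCL makes open() fail with EEXIST if the name is already taken.
    const int fd = open(candidate.c_str(), O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd >= 0) {
      result.fd = fd;
      result.path = candidate;
      return result;
    }
    if (errno != EEXIST) {
      break;  // A real error, not a collision: stop retrying.
    }
    // Collision: loop around and pay for another open() call.
  }
  return result;
}
```

If many threads draw candidate names that collide, `attempts` grows and each thread spends longer inside this loop. Whether that actually happens often enough to explain the stacks above is still only a hypothesis.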

      Maybe we could put the thread id into the temporary file name when a thread does a cmeta flush.
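
A minimal sketch of that idea follows. `MakePerThreadTmpTemplate` and the `.tmp.` infix are hypothetical and are not taken from Kudu's code; the sketch only shows how a per-thread token could be placed in front of the random `XXXXXX` suffix that mkstemp-style templates must end with.

```cpp
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical helper, not part of Kudu: builds a mkstemp-style template
// for 'path' that embeds a token derived from the calling thread's id.
std::string MakePerThreadTmpTemplate(const std::string& path) {
  // std::thread::id is opaque, so hash it to obtain a printable number.
  const std::size_t tid_token =
      std::hash<std::thread::id>{}(std::this_thread::get_id());
  std::ostringstream out;
  // The trailing "XXXXXX" is kept so the result is still a valid template.
  out << path << ".tmp." << std::hex << tid_token << ".XXXXXX";
  return out.str();
}
```

With a template like this, two threads flushing at the same time would normally propose names that differ in the thread token, so they should not collide with each other even if they draw the same random suffix. Hash collisions between thread ids are possible in principle, so the random suffix and the exclusive create are still needed.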

              People

              • Assignee: Unassigned
              • Reporter: William Berkeley (wdberkeley)