KUDU-2707

Improve the performance of the block cache under contention


Details

    • Type: Improvement
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.10.0
    • Fix Version/s: NA
    • Component/s: None
    • Labels: None

    Description

      While looking at a random write workload where flushes outpace compactions (i.e. the typical case when inserting as fast as possible), I saw occasional consensus service queue overflows. Analyzing the stacks of the service threads at the time of an overflow (using the diagnostics log), I see many stacks blocked on the block cache's spinlock, like this one:

      0x3b6720f710 <unknown>
                 0x1fb900a base::internal::SpinLockDelay()
                 0x1fb8ea7 base::SpinLock::SlowLock()
                 0x1ef7394 kudu::(anonymous namespace)::ShardedLRUCache::Lookup()
                 0x1ce379f kudu::cfile::BlockCache::Lookup()
                 0x1cec948 kudu::cfile::CFileReader::ReadBlock()
                 0x1ce5d36 kudu::cfile::BloomFileReader::CheckKeyPresent()
                  0xb311a1 kudu::tablet::CFileSet::CheckRowPresent()
                  0xac46c4 kudu::tablet::DiskRowSet::CheckRowPresent()
                  0xa6b017 _ZZN4kudu6tablet6Tablet17BulkCheckPresenceEPKNS_2fs9IOContextEPNS0_21WriteTransactionStateEENKUlvE1_clEv
                  0xa7427e _ZNSt17_Function_handlerIFvPN4kudu6tablet6RowSetEiEZNS1_6Tablet17BulkCheckPresenceEPKNS0_2fs9IOContextEPNS1_21WriteTransactionStateEEUlS3_iE2_E9_M_invokeERKSt9_Any_dataS3_i
                  0xaee074 _ZNK4kudu22interval_tree_internal6ITNodeINS_6tablet20RowSetIntervalTraitsEE31ForEachIntervalContainingPointsIZNKS2_10RowSetTree27ForEachRowSetContainingKeysERKSt6vectorINS_5SliceESaIS8_EERKSt8functionIFvPNS2_6RowSetEiEEEUlRKNS2_12_GLOBAL__N_111QueryStructEPNS2_16RowSetWithBoundsEE_N9__gnu_cxx17__normal_iteratorIPSM_S7_ISL_SaISL_EEEEEEvT0_SX_RKT_
                  0xaee1b3 _ZNK4kudu22interval_tree_internal6ITNodeINS_6tablet20RowSetIntervalTraitsEE31ForEachIntervalContainingPointsIZNKS2_10RowSetTree27ForEachRowSetContainingKeysERKSt6vectorINS_5SliceESaIS8_EERKSt8functionIFvPNS2_6RowSetEiEEEUlRKNS2_12_GLOBAL__N_111QueryStructEPNS2_16RowSetWithBoundsEE_N9__gnu_cxx17__normal_iteratorIPSM_S7_ISL_SaISL_EEEEEEvT0_SX_RKT_
                  0xaee3a3 kudu::tablet::RowSetTree::ForEachRowSetContainingKeys()
                  0xa80c17 kudu::tablet::Tablet::BulkCheckPresence()
                  0xa8108a kudu::tablet::Tablet::ApplyRowOperations()
      

      Note that once these workloads have been running for a while, the slow step in writes is generally CPU usage in the apply phase.
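
      To illustrate why Lookup() can contend in a stack like the one above, here is a minimal sketch of a sharded LRU cache. It is not Kudu's ShardedLRUCache: the class names LruShard and ShardedCache, the string keys and values, and the use of std::mutex in place of a spinlock are all assumptions made for illustration. The point it shows is that an LRU hit updates the recency list, so even a read-only lookup needs the shard's exclusive lock, and concurrent bloom filter checks that hash to the same shard serialize on it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <memory>
#include <mutex>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical illustration only; not Kudu's implementation.
// One shard: a hash index plus a recency list, guarded by a single lock.
class LruShard {
 public:
  explicit LruShard(size_t capacity) : capacity_(capacity) {}

  // A hit moves the entry to the front of the recency list, so Lookup
  // mutates shared state and must hold the exclusive lock.
  std::optional<std::string> Lookup(const std::string& key) {
    std::lock_guard<std::mutex> l(lock_);
    auto it = index_.find(key);
    if (it == index_.end()) return std::nullopt;
    lru_.splice(lru_.begin(), lru_, it->second);
    return it->second->second;
  }

  void Insert(const std::string& key, std::string value) {
    std::lock_guard<std::mutex> l(lock_);
    auto it = index_.find(key);
    if (it != index_.end()) {
      // Existing key: overwrite the value and mark it most recently used.
      it->second->second = std::move(value);
      lru_.splice(lru_.begin(), lru_, it->second);
      return;
    }
    lru_.emplace_front(key, std::move(value));
    index_[key] = lru_.begin();
    if (index_.size() > capacity_) {
      // Evict the least recently used entry (the back of the list).
      index_.erase(lru_.back().first);
      lru_.pop_back();
    }
  }

 private:
  using Entry = std::pair<std::string, std::string>;
  const size_t capacity_;
  std::mutex lock_;  // stand-in for the spinlock seen in the stack trace
  std::list<Entry> lru_;
  std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};

// Routes each key to one shard by hash. Sharding spreads the load across
// several locks, but every lookup still takes its shard's lock.
class ShardedCache {
 public:
  ShardedCache(size_t num_shards, size_t capacity_per_shard) {
    assert(num_shards > 0);
    for (size_t i = 0; i < num_shards; ++i) {
      shards_.push_back(std::make_unique<LruShard>(capacity_per_shard));
    }
  }

  std::optional<std::string> Lookup(const std::string& key) {
    return ShardFor(key).Lookup(key);
  }

  void Insert(const std::string& key, std::string value) {
    ShardFor(key).Insert(key, std::move(value));
  }

 private:
  LruShard& ShardFor(const std::string& key) {
    return *shards_[std::hash<std::string>{}(key) % shards_.size()];
  }

  std::vector<std::unique_ptr<LruShard>> shards_;
};
```

      In a design like this, adding shards lowers the chance that two threads collide on the same lock, but it does not remove the lock from the read path. Whether more shards, a cheaper recency update, or a different eviction policy is the right fix for this issue is left open here.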

          People

            Assignee: Unassigned
            Reporter: William Berkeley (wdberkeley)