Details
Bug
Status: Resolved
Major
Resolution: Fixed
M4
Description
I am occasionally getting a TSAN complaint when running a new test that does manual delta flushes.
The race is triggered by this code in DeltaTracker::Flush():
// Swap the DeltaMemStore to use the new schema
old_dms = dms_;
dms_.reset(new DeltaMemStore(old_dms->id() + 1, schema_,
                             opid_anchor_registry_, parent_tracker_));
Racing with DeltaTracker::DeltaMemStoreSize() from another thread:
size_t DeltaTracker::DeltaMemStoreSize() const {
  return dms_->memory_footprint();
}
It seems we need to better protect access to dms_, either with a lock or perhaps by taking an extra reference to the DeltaMemStore while accessing it. TSAN message:
WARNING: ThreadSanitizer: data race (pid=2354)
  Write of size 8 at 0x7d540005f0e0 by main thread (mutexes: write M16555):
    #0 void std::swap<kudu::tablet::DeltaMemStore*>(kudu::tablet::DeltaMemStore*&, kudu::tablet::DeltaMemStore*&) /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/move.h:83 (libtablet.so+0x0000002af2ba)
    #1 std::tr1::__shared_ptr<kudu::tablet::DeltaMemStore, (__gnu_cxx::_Lock_policy)2>::swap(std::tr1::__shared_ptr<kudu::tablet::DeltaMemStore, (__gnu_cxx::_Lock_policy)2>&) /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/tr1/shared_ptr.h:551 (libtablet.so+0x0000002af260)
    #2 void std::tr1::__shared_ptr<kudu::tablet::DeltaMemStore, (__gnu_cxx::_Lock_policy)2>::reset<kudu::tablet::DeltaMemStore>(kudu::tablet::DeltaMemStore*) /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/tr1/shared_ptr.h:505 (libtablet.so+0x0000002ab865)
    #3 kudu::tablet::DeltaTracker::Flush(kudu::tablet::DeltaTracker::MetadataFlushType) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/delta_tracker.cc:412 (libtablet.so+0x0000002aab70)
    #4 kudu::tablet::DiskRowSet::FlushDeltas() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/diskrowset.cc:409 (libtablet.so+0x000000238bcb)
    #5 kudu::tablet::Tablet::FlushBiggestDMS() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/tablet.cc:1381 (libtablet.so+0x0000001a1b02)
    #6 kudu::RemoteBootstrapTest_TestRemoteBootstrap_Test::TestBody() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/integration-tests/remote_bootstrap-test.cc:226 (remote_bootstrap-test+0x0000000b001b)
    #7 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null>:0 (libgtest.so+0x00000005f8a5)
    #8 __libc_start_main <null>:0 (libc.so.6+0x00000001ecdc)

  Previous read of size 8 at 0x7d540005f0e0 by thread T49 (mutexes: write M1271):
    #0 std::tr1::__shared_ptr<kudu::tablet::DeltaMemStore, (__gnu_cxx::_Lock_policy)2>::operator->() const /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/tr1/shared_ptr.h:525 (libtablet.so+0x0000002ac279)
    #1 kudu::tablet::DeltaTracker::DeltaMemStoreSize() const /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/delta_tracker.cc:470 (libtablet.so+0x0000002ab050)
    #2 kudu::tablet::DiskRowSet::DeltaMemStoreSize() const /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/diskrowset.cc:562 (libtablet.so+0x000000239ca7)
    #3 kudu::tablet::Tablet::DeltaMemStoresSize() const /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/tablet.cc:1362 (libtablet.so+0x0000001a18df)
    #4 kudu::tablet::FlushDeltaMemStoresOp::UpdateStats(kudu::MaintenanceOpStats*) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/tablet.cc:847 (libtablet.so+0x0000001c40ea)
    #5 kudu::MaintenanceManager::FindBestOp() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/maintenance_manager.cc:239 (libtablet.so+0x000000243a11)
    #6 kudu::MaintenanceManager::RunSchedulerThread() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/maintenance_manager.cc:187 (libtablet.so+0x000000242fc4)
    #7 boost::_mfi::mf0<void, kudu::MaintenanceManager>::operator()(kudu::MaintenanceManager*) const /usr/include/boost/bind/mem_fn_template.hpp:49 (libtablet.so+0x00000024794d)
    #8 void boost::_bi::list1<boost::_bi::value<kudu::MaintenanceManager*> >::operator()<boost::_mfi::mf0<void, kudu::MaintenanceManager>, boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void, kudu::MaintenanceManager>&, boost::_bi::list0&, int) /usr/include/boost/bind/bind.hpp:246 (libtablet.so+0x0000002478ba)
    #9 boost::_bi::bind_t<void, boost::_mfi::mf0<void, kudu::MaintenanceManager>, boost::_bi::list1<boost::_bi::value<kudu::MaintenanceManager*> > >::operator()() /usr/include/boost/bind/bind_template.hpp:20 (libtablet.so+0x000000247863)
    #10 boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf0<void, kudu::MaintenanceManager>, boost::_bi::list1<boost::_bi::value<kudu::MaintenanceManager*> > >, void>::invoke(boost::detail::function::function_buffer&) /usr/include/boost/function/function_template.hpp:153 (libtablet.so+0x000000247669)
    #11 boost::function0<void>::operator()() const /usr/include/boost/function/function_template.hpp:1012 (libtablet.so+0x0000001fb051)
    #12 kudu::Thread::SuperviseThread(void*) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/util/thread.cc:435 (libkudu_util.so+0x000000138a0b)

  Location is heap block of size 520 at 0x7d540005f000 allocated by main thread:
    #0 operator new(unsigned long) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/thirdparty/llvm-3.4.2.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:560 (remote_bootstrap-test+0x00000004590a)
    #1 kudu::tablet::DiskRowSet::Open() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/diskrowset.cc:398 (libtablet.so+0x000000238910)
    #2 kudu::tablet::DiskRowSet::Open(std::tr1::shared_ptr<kudu::metadata::RowSetMetadata> const&, kudu::log::OpIdAnchorRegistry*, std::tr1::shared_ptr<kudu::tablet::DiskRowSet>*, std::tr1::shared_ptr<kudu::MemTracker> const&) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/diskrowset.cc:376 (libtablet.so+0x00000023879a)
    #3 kudu::tablet::Tablet::DoCompactionOrFlush(kudu::Schema const&, kudu::tablet::RowSetsInCompaction const&, long) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/tablet.cc:1058 (libtablet.so+0x00000019cdd4)
    #4 kudu::tablet::Tablet::FlushInternal(kudu::tablet::RowSetsInCompaction const&, std::tr1::shared_ptr<kudu::tablet::MemRowSet> const&, kudu::Schema const&) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/tablet.cc:616 (libtablet.so+0x00000019c603)
    #5 kudu::tablet::Tablet::FlushUnlocked() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/tablet.cc:557 (libtablet.so+0x00000019c042)
    #6 kudu::tablet::Tablet::Flush() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/tablet.cc:538 (libtablet.so+0x00000019bf44)
    #7 kudu::RemoteBootstrapTest_TestRemoteBootstrap_Test::TestBody() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/integration-tests/remote_bootstrap-test.cc:225 (remote_bootstrap-test+0x0000000aff5f)
    #8 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null>:0 (libgtest.so+0x00000005f8a5)
    #9 __libc_start_main <null>:0 (libc.so.6+0x00000001ecdc)

  Mutex M16555 created at:
    #0 pthread_mutex_init /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/thirdparty/llvm-3.4.2.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:925 (remote_bootstrap-test+0x000000049a8c)
    #1 boost::mutex::mutex() /usr/include/boost/thread/pthread/mutex.hpp:37 (libtserver.so+0x000000098e1b)
    #2 kudu::tablet::DeltaTracker::DeltaTracker(std::tr1::shared_ptr<kudu::metadata::RowSetMetadata> const&, kudu::Schema const&, unsigned int, kudu::log::OpIdAnchorRegistry*, kudu::MemTracker*) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/delta_tracker.cc:41 (libtablet.so+0x0000002a763f)
    #3 kudu::tablet::DiskRowSet::Open() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/diskrowset.cc:398 (libtablet.so+0x000000238967)
    #4 kudu::tablet::DiskRowSet::Open(std::tr1::shared_ptr<kudu::metadata::RowSetMetadata> const&, kudu::log::OpIdAnchorRegistry*, std::tr1::shared_ptr<kudu::tablet::DiskRowSet>*, std::tr1::shared_ptr<kudu::MemTracker> const&) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/diskrowset.cc:376 (libtablet.so+0x00000023879a)
    #5 kudu::tablet::Tablet::DoCompactionOrFlush(kudu::Schema const&, kudu::tablet::RowSetsInCompaction const&, long) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/tablet.cc:1058 (libtablet.so+0x00000019cdd4)
    #6 kudu::tablet::Tablet::FlushInternal(kudu::tablet::RowSetsInCompaction const&, std::tr1::shared_ptr<kudu::tablet::MemRowSet> const&, kudu::Schema const&) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/tablet.cc:616 (libtablet.so+0x00000019c603)
    #7 kudu::tablet::Tablet::FlushUnlocked() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/tablet.cc:557 (libtablet.so+0x00000019c042)
    #8 kudu::tablet::Tablet::Flush() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/tablet.cc:538 (libtablet.so+0x00000019bf44)
    #9 kudu::RemoteBootstrapTest_TestRemoteBootstrap_Test::TestBody() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/integration-tests/remote_bootstrap-test.cc:225 (remote_bootstrap-test+0x0000000aff5f)
    #10 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null>:0 (libgtest.so+0x00000005f8a5)
    #11 __libc_start_main <null>:0 (libc.so.6+0x00000001ecdc)

  Mutex M1271 created at:
    #0 pthread_mutex_init /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/thirdparty/llvm-3.4.2.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:925 (remote_bootstrap-test+0x000000049a8c)
    #1 boost::mutex::mutex() /usr/include/boost/thread/pthread/mutex.hpp:37 (libtserver.so+0x000000098e1b)
    #2 kudu::MaintenanceManager::MaintenanceManager(kudu::MaintenanceManager::Options const&) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/maintenance_manager.cc:93 (libtablet.so+0x000000242583)
    #3 kudu::tserver::TabletServer::TabletServer(kudu::tserver::TabletServerOptions const&) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tserver/tablet_server.cc:41 (libtserver.so+0x0000000a7060)
    #4 kudu::tserver::MiniTabletServer::Start() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tserver/mini_tablet_server.cc:64 (libtserver.so+0x000000093a80)
    #5 kudu::MiniCluster::AddTabletServer() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/integration-tests/mini_cluster.cc:102 (libintegration-tests.so+0x000000020107)
    #6 kudu::MiniCluster::Start() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/integration-tests/mini_cluster.cc:65 (libintegration-tests.so+0x00000001fbba)
    #7 kudu::RemoteBootstrapTest::Start() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/integration-tests/remote_bootstrap-test.cc:100 (remote_bootstrap-test+0x0000000c0187)
    #8 kudu::RemoteBootstrapTest::SetUp() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/integration-tests/remote_bootstrap-test.cc:81 (remote_bootstrap-test+0x0000000b6932)
    #9 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null>:0 (libgtest.so+0x00000005f8a5)
    #10 __libc_start_main <null>:0 (libc.so.6+0x00000001ecdc)

  Thread T49 'maintenance_sch' (tid=2734, running) created by main thread at:
    #0 pthread_create /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/thirdparty/llvm-3.4.2.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:877 (remote_bootstrap-test+0x0000000493df)
    #1 kudu::Thread::StartThread(std::string const&, std::string const&, boost::function<void ()()> const&, scoped_refptr<kudu::Thread>*) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/util/thread.cc:365 (libkudu_util.so+0x0000001384b6)
    #2 kudu::Status kudu::Thread::Create<boost::_bi::bind_t<void, boost::_mfi::mf0<void, kudu::MaintenanceManager>, boost::_bi::list1<boost::_bi::value<kudu::MaintenanceManager*> > > >(std::string const&, std::string const&, boost::_bi::bind_t<void, boost::_mfi::mf0<void, kudu::MaintenanceManager>, boost::_bi::list1<boost::_bi::value<kudu::MaintenanceManager*> > > const&, scoped_refptr<kudu::Thread>*) /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/util/thread.h:116 (libtablet.so+0x000000244baa)
    #3 kudu::MaintenanceManager::Init() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tablet/maintenance_manager.cc:106 (libtablet.so+0x000000242c87)
    #4 kudu::tserver::TabletServer::Start() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tserver/tablet_server.cc:95 (libtserver.so+0x0000000a7a5e)
    #5 kudu::tserver::MiniTabletServer::Start() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/tserver/mini_tablet_server.cc:66 (libtserver.so+0x000000093ad2)
    #6 kudu::MiniCluster::AddTabletServer() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/integration-tests/mini_cluster.cc:102 (libintegration-tests.so+0x000000020107)
    #7 kudu::MiniCluster::Start() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/integration-tests/mini_cluster.cc:65 (libintegration-tests.so+0x00000001fbba)
    #8 kudu::RemoteBootstrapTest::Start() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/integration-tests/remote_bootstrap-test.cc:100 (remote_bootstrap-test+0x0000000c0187)
    #9 kudu::RemoteBootstrapTest::SetUp() /data1/jenkins-workspace/kudu-gerrit/BUILD_TYPE/TSAN/label/kudu-gerrit-slaves/src/kudu/integration-tests/remote_bootstrap-test.cc:81 (remote_bootstrap-test+0x0000000b6932)
    #10 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null>:0 (libgtest.so+0x00000005f8a5)
    #11 __libc_start_main <null>:0 (libc.so.6+0x00000001ecdc)

SUMMARY: ThreadSanitizer: data race /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/move.h:83 void std::swap<kudu::tablet::DeltaMemStore*>(kudu::tablet::DeltaMemStore*&, kudu::tablet::DeltaMemStore*&)
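
For illustration, here is a minimal sketch of the lock-based option mentioned in the description. In the TSAN report the writer holds M16555 and the reader holds M1271, so the two threads share no common lock around dms_. The sketch is not the actual Kudu fix: FakeDeltaMemStore, FakeDeltaTracker, dms_lock_, CurrentId() and RunFlushesWithConcurrentReader() are hypothetical stand-ins, and it uses std::shared_ptr and std::mutex rather than the tr1/boost types in the trace. The idea is that both the swap and the read copy the pointer under one mutex, then use the copy outside the lock.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for kudu::tablet::DeltaMemStore.
class FakeDeltaMemStore {
 public:
  explicit FakeDeltaMemStore(int id) : id_(id) {}
  int id() const { return id_; }
  std::size_t memory_footprint() const { return 64; }

 private:
  const int id_;
};

// Hypothetical stand-in for DeltaTracker; only the dms_ handling is modeled.
class FakeDeltaTracker {
 public:
  FakeDeltaTracker() : dms_(std::make_shared<FakeDeltaMemStore>(0)) {}

  void Flush() {
    std::shared_ptr<FakeDeltaMemStore> old_dms;
    {
      // Swap in the new store while holding the lock that readers also take.
      std::lock_guard<std::mutex> l(dms_lock_);
      old_dms = dms_;
      dms_ = std::make_shared<FakeDeltaMemStore>(old_dms->id() + 1);
    }
    // old_dms keeps the previous store alive; it could be flushed here
    // without holding dms_lock_.
  }

  std::size_t DeltaMemStoreSize() const {
    std::shared_ptr<FakeDeltaMemStore> dms;
    {
      // Copy the pointer (taking a reference) under the lock.
      std::lock_guard<std::mutex> l(dms_lock_);
      dms = dms_;
    }
    assert(dms);
    return dms->memory_footprint();
  }

  int CurrentId() const {
    std::lock_guard<std::mutex> l(dms_lock_);
    return dms_->id();
  }

 private:
  mutable std::mutex dms_lock_;  // Protects dms_ (the pointer, not the store).
  std::shared_ptr<FakeDeltaMemStore> dms_;
};

// Flushes on the calling thread while a second thread polls the size,
// mirroring the main thread / maintenance thread pair in the report.
// Returns the id of the store that is current once both are done.
int RunFlushesWithConcurrentReader(int num_flushes) {
  FakeDeltaTracker tracker;
  std::thread reader([&tracker, num_flushes] {
    for (int i = 0; i < num_flushes; ++i) tracker.DeltaMemStoreSize();
  });
  for (int i = 0; i < num_flushes; ++i) tracker.Flush();
  reader.join();
  return tracker.CurrentId();
}
```

Note that copying the shared pointer without any lock would not be enough on its own, since the copy would still read dms_ while Flush() writes it. Taking the reference under the lock keeps the critical section short and lets memory_footprint() run on a store that cannot be destroyed underneath the reader.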