Details
- Type: Bug
- Status: Open
- Priority: Critical
- Resolution: Unresolved
- 1.17.0
- None
- None
Description
When Kudu is compiled with GCC 11 or newer, masters and tablet servers crash. The stack trace of a crashing kudu-master looks like the following:
PC: @ 0x7ff822b7cd1d __memmove_avx_unaligned_erms
*** SIGSEGV (@0x8000) received by PID 189412 (TID 0x7ff81b4d5700) from PID 32768; stack trace: ***
    @ 0xe26c81 google::(anonymous namespace)::FailureSignalHandler()
    @ 0x7ff8246298c0 (unknown)
    @ 0x7ff822b7cd1d __memmove_avx_unaligned_erms
    @ 0x7ff824ea20fc (unknown)
    @ 0x132ceeb kudu::tablet::(anonymous namespace)::MRSRowProjectorImpl<>::ProjectRowForRead()
    @ 0x132c543 kudu::tablet::MemRowSet::Iterator::FetchRows()
    @ 0x132cb61 kudu::tablet::MemRowSet::Iterator::NextBlock()
    @ 0x2cddb7c kudu::PredicateEvaluatingIterator::NextBlock()
    @ 0x2cde488 kudu::UnionIterator::NextBlock()
    @ 0x12b5a29 kudu::tablet::Tablet::Iterator::NextBlock()
    @ 0xd6dfda kudu::master::SysCatalogTable::ProcessRows<>()
    @ 0xd66cae kudu::master::SysCatalogTable::VisitTables()
    @ 0xddeba8 kudu::master::MasterPathHandlers::HandleDumpEntities()
    @ 0x1275c2b kudu::Webserver::RunPathHandler()
    @ 0x12767b1 kudu::Webserver::BeginRequestCallback()
    @ 0x12b08fc handle_request
    @ 0x12b377c process_new_connection
    @ 0x12b3e80 worker_thread
    @ 0x7ff82461d6ea start_thread
    @ 0x7ff822b13a6f __GI___clone
A quick litmus test is to run codegen-test, which crashes with a similar stack trace:
# ./bin/codegen-test
[==========] Running 12 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 12 tests from CodegenTest
[ RUN ] CodegenTest.ObservablesTest
I0420 17:19:27.839332 175031 test_util.cc:255] Using random seed: -1104489386
[ OK ] CodegenTest.ObservablesTest (217 ms)
[ RUN ] CodegenTest.TestEmpty
I0420 17:19:28.048970 175031 test_util.cc:255] Using random seed: -1104279736
[ OK ] CodegenTest.TestEmpty (138 ms)
[ RUN ] CodegenTest.TestKey
I0420 17:19:28.186726 175031 test_util.cc:255] Using random seed: -1104141979
[ OK ] CodegenTest.TestKey (125 ms)
[ RUN ] CodegenTest.TestInts
I0420 17:19:28.312000 175031 test_util.cc:255] Using random seed: -1104016705
[ OK ] CodegenTest.TestInts (144 ms)
[ RUN ] CodegenTest.TestStrings
I0420 17:19:28.455729 175031 test_util.cc:255] Using random seed: -1103872977
*** Aborted at 1682011168 (unix time) try "date -d @1682011168" if you are using GNU date ***
PC: @ 0x7f3a14924508 __memmove_evex_unaligned_erms
*** SIGSEGV (@0x0) received by PID 175031 (TID 0x7f3a1621bcc0) from PID 0; stack trace: ***
    @ 0x7c9d92 google::(anonymous namespace)::FailureSignalHandler()
    @ 0x7f3a15c7a8c0 (unknown)
    @ 0x7f3a14924508 __memmove_evex_unaligned_erms
    @ 0x7f3a162c50d4 (unknown)
    @ 0x7b5a0f kudu::CodegenTest::ProjectTestRows<>()
    @ 0x7bc13c kudu::CodegenTest::TestProjection<>()
    @ 0x7ad8ee kudu::CodegenTest_TestStrings_Test::TestBody()
    @ 0x84e517 testing::internal::HandleExceptionsInMethodIfSupported<>()
    @ 0x842e86 testing::Test::Run()
    @ 0x842ff5 testing::TestInfo::Run()
    @ 0x8430e5 testing::TestSuite::Run()
    @ 0x84362e testing::internal::UnitTestImpl::RunAllTests()
    @ 0x84e9f7 testing::internal::HandleExceptionsInMethodIfSupported<>()
    @ 0x843837 testing::UnitTest::Run()
    @ 0x75876a main
    @ 0x7f3a147ca29d __libc_start_main
    @ 0x7aaf0a _start
Segmentation fault (core dumped)
As a workaround, disable the codegen when running kudu-master and kudu-tserver processes:
--mrs_use_codegen=false
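One way to apply the workaround persistently is through a gflags flag file. The following is only a minimal sketch: it assumes the daemons are started with the standard gflags --flagfile option, and the file path is hypothetical. In an existing deployment the line would instead be appended to the flag file already in use.

```shell
#!/bin/sh
set -eu

# Hypothetical location; substitute the flag file your deployment already uses.
FLAGFILE="${FLAGFILE:-./codegen-workaround.gflagfile}"

# gflags flag files hold one flag per line.
printf '%s\n' '--mrs_use_codegen=false' > "$FLAGFILE"

# Show what was written.
cat "$FLAGFILE"

# The daemons would then be started along these lines (not run here):
#   kudu-master  --flagfile="$FLAGFILE" <other flags>
#   kudu-tserver --flagfile="$FLAGFILE" <other flags>
```

The flag can equally be passed directly on the command line of each process. A flag file just keeps the setting in one place across restarts.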
Fixing this issue should unblock Kudu adoption on contemporary Linux distributions where GCC 11 or newer is the system compiler (RHEL/CentOS 9, Ubuntu 22, etc.).
Issue Links
- is required by: KUDU-3480 Ubuntu 18.04 EOL (Open)