
NVM-based cache test scenario in cfile-test crashes on CentOS6


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.10.0
    • Fix Version/s: NA
    • Component/s: None
    • Labels: None

    Description

      TestCFileBothCacheMemoryTypes.TestReadWriteLargeStrings started crashing with SIGSEGV fairly often when built and run on CentOS 6.

      This does not happen on other platforms.

      To reproduce, run

      ./bin/cfile-test --gtest_filter='*CacheMemoryTypes/TestCFileBothCacheMemoryTypes.TestReadWriteLargeStrings/1'
      
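      Because the crash is intermittent, a single run may pass. The following is a minimal sketch of a helper that runs a command several times and counts the failing runs. The function name count_failures and the run count of 20 are illustrative assumptions, not part of the Kudu tooling, and the example assumes the current directory is the Kudu build directory.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly and print how many
# runs exited with a non-zero status (a SIGSEGV counts as a failure).
# Usage: count_failures RUNS COMMAND [ARGS...]
count_failures() {
  runs=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Discard the test output; only the exit status matters here.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures"
}

# Example invocation (20 runs is an arbitrary choice):
# count_failures 20 ./bin/cfile-test \
#   --gtest_filter='*CacheMemoryTypes/TestCFileBothCacheMemoryTypes.TestReadWriteLargeStrings/1'
```

      A separate process per run is used here because a segmentation fault terminates the test binary, so an in-process repeat would stop at the first crash.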

      The stack trace looks like the following:

      *** Aborted at 1571371818 (unix time) try "date -d @1571371818" if you are using GNU date ***
      PC: @     0x7f020c89a719 kudu::(anonymous namespace)::ShardedLRUCache::Allocate()
      *** SIGSEGV (@0x7f01e424a4f8) received by PID 59228 (TID 0x7f020787f040) from PID 18446744073242191096; stack trace: ***
          @       0x3ae0e0f710 (unknown)
          @     0x7f020c89a719 kudu::(anonymous namespace)::ShardedLRUCache::Allocate()
          @     0x7f020cdf36a9 kudu::Cache::Allocate()
          @     0x7f020cdf2f3e kudu::cfile::BlockCache::Allocate()
          @     0x7f020ce029aa kudu::cfile::(anonymous namespace)::ScratchMemory::TryAllocateFromCache()
          @     0x7f020ce03298 kudu::cfile::CFileReader::ReadBlock()
          @     0x7f020ce072f0 kudu::cfile::CFileIterator::ReadCurrentDataBlock()
          @     0x7f020ce07b16 kudu::cfile::CFileIterator::QueueCurrentDataBlock()
          @     0x7f020ce081b4 kudu::cfile::CFileIterator::PrepareBatch()
          @     0x7f020ce0a460 kudu::cfile::CFileIterator::CopyNextValues()
          @           0x498b1e kudu::cfile::TestCFile::TestReadWriteStrings()
          @           0x499dd3 kudu::cfile::TestCFileBothCacheMemoryTypes_TestReadWriteLargeStrings_Test::TestBody()
          @     0x7f020caeeb98 testing::internal::HandleExceptionsInMethodIfSupported<>()
          @     0x7f020cadc1b2 testing::Test::Run()
          @     0x7f020cadc2f8 testing::TestInfo::Run()
          @     0x7f020cadc3d5 testing::TestCase::Run()
          @     0x7f020cae2ed8 testing::internal::UnitTestImpl::RunAllTests()
          @     0x7f020caef0a8 testing::internal::HandleExceptionsInMethodIfSupported<>()
          @     0x7f020cadc4ad testing::UnitTest::Run()
          @     0x7f020d2f0f7f RUN_ALL_TESTS()
          @     0x7f020d2eed90 main
          @       0x3ae0a1ed5d __libc_start_main
          @           0x4944d9 (unknown)
      Segmentation fault (core dumped)
      

      The suspects were the following changelists:

      • 5e3af4e2ae45ee5f700b9c6c28d56ff84ffeb319: [util] modernize signature of Cache interface methods
      • 74a1d7706d99db2d9a14ed5d7c64afbcef853b20: [util] change return type of Cache::Allocate()
      • 946e2bc05419e3a552fed5a9d28e83861ff1eea1: KUDU-2605: replace nvml with memkind

      I reverted the first two (one at a time), but the crash still occurs.

      The test built from the code right before the third changelist (i.e. at the snapshot of revision 8410f0ca44e17ef6242cc9b25da49b568ddb0955) doesn't crash.

      The code with the first two changelists reverted is at: https://github.com/alexeyserbin/kudu/commits/nvm-cache-crash


          People

            aserbin Alexey Serbin
            aserbin Alexey Serbin