Mesos / MESOS-8489

LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags is flaky

Details

    • Mesosphere Sprint 73, Mesosphere Sprint 74, Mesosphere Sprint 75, Mesosphere Sprint 76, Mesosphere Sprint 77
    • 8

    Description

      Observed this on internal Mesosphere CI.

      ../../src/tests/cluster.cpp:662: Failure
      Value of: containers->empty()
        Actual: false
      Expected: true
      Failed to destroy containers: { test }
      

      Steps to reproduce

      1. Add `::sleep(1);` before the code that removes the "test" cgroup.
      2. Recompile.
      3. Run `sudo GLOG_v=2 ./src/mesos-tests --gtest_filter=LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags --gtest_break_on_failure --gtest_repeat=10 --verbose`

      Race description

      While recovery is still in progress for the first slave, the next call to `StartSlave()` calls `slave::Containerizer::create()` to create a containerizer for the second slave. Creating a Mesos containerizer in turn calls `cgroups::prepare`, which creates a temporary "test" cgroup. If the first slave's recovery scans the cgroups while this "test" cgroup still exists, it might detect it as an orphaned container.

      Thus, there is a race between the recovery process for the first slave and the attempt to create a containerizer for the second slave.
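
      The interleaving described above can be sketched with a small standalone model. This is not Mesos code: `Hierarchy`, `findOrphans` and `simulate` are hypothetical names made up for illustration, and the model assumes that recovery treats every cgroup it does not know about as an orphaned container. The two steps are ordered explicitly instead of using threads, so the outcome is deterministic.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the cgroup hierarchy shared by both slaves.
using Hierarchy = std::set<std::string>;

// Stand-in for recovery (assumption): every cgroup that is not a known
// container is reported as an orphan.
std::set<std::string> findOrphans(
    const Hierarchy& hierarchy,
    const std::set<std::string>& known)
{
  std::set<std::string> orphans;
  for (const std::string& cgroup : hierarchy) {
    if (known.count(cgroup) == 0) {
      orphans.insert(cgroup);
    }
  }
  return orphans;
}

// Models one interleaving. If `scanDuringPrepare` is true, the first
// slave's recovery scan runs between the creation and the removal of the
// temporary "test" cgroup; otherwise it runs after the removal.
std::set<std::string> simulate(bool scanDuringPrepare)
{
  Hierarchy hierarchy;
  const std::set<std::string> known; // No real containers were launched.
  std::set<std::string> orphans;

  hierarchy.insert("test"); // Second slave: "test" cgroup is created.

  if (scanDuringPrepare) {
    orphans = findOrphans(hierarchy, known); // First slave: unlucky scan.
  }

  hierarchy.erase("test"); // Second slave: "test" cgroup is removed.

  if (!scanDuringPrepare) {
    orphans = findOrphans(hierarchy, known); // First slave: safe scan.
  }

  return orphans;
}
```

      In this model, `simulate(true)` returns a set containing "test", which corresponds to the `Failed to destroy containers: { test }` failure, while `simulate(false)` returns an empty set. The `::sleep(1)` from the reproduction steps widens the window in which the first case can happen.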

      Attachments

        1. ubuntu 9.04.png
          361 kB
          Andrei Budnik
        2. ROOT_IsolatorFlags-badrun3.txt
          31 kB
          Andrei Budnik

            People

              abudnik Andrei Budnik
              abudnik Andrei Budnik
              Gilbert Song Gilbert Song