MESOS-9174

Unexpected containers transition from RUNNING to DESTROYING during recovery

Details

    Description

      I am trying to hunt down a weird issue where restarting a Mesos agent sometimes takes down all of its containers. The containers die without an apparent cause:

      I0821 13:35:01.486346 61392 linux_launcher.cpp:360] Recovered container 02da7be0-271e-449f-9554-dc776adb29a9
      I0821 13:35:03.627367 61362 provisioner.cpp:451] Recovered container 02da7be0-271e-449f-9554-dc776adb29a9
      I0821 13:35:03.701448 61375 containerizer.cpp:2835] Container 02da7be0-271e-449f-9554-dc776adb29a9 has exited
      I0821 13:35:03.701453 61375 containerizer.cpp:2382] Destroying container 02da7be0-271e-449f-9554-dc776adb29a9 in RUNNING state
      I0821 13:35:03.701457 61375 containerizer.cpp:2996] Transitioning the state of container 02da7be0-271e-449f-9554-dc776adb29a9 from RUNNING to DESTROYING
      

      From the perspective of the executor, there is nothing relevant in the logs. Everything just stops abruptly, as if the container were terminated externally without the executor being notified first. For further details, please see the attached agent log and one example executor log file.
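
      One way to narrow this down is to pull every agent log line that mentions a single container ID and look at the time gaps between events. The following is a minimal sketch, assuming the agent log uses the glog-style prefix seen in the excerpt above. The names `GLOG_LINE` and `container_timeline` are hypothetical helpers for illustration and are not part of Mesos.

```python
import re
from datetime import datetime

# glog-style prefix as it appears in the excerpt (an assumption about the
# rest of the log): <severity><MMDD> <HH:MM:SS.ffffff> <thread> <file>:<line>] <msg>
GLOG_LINE = re.compile(
    r"^\s*(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[^\s:]+:\d+)\] (?P<msg>.*)$"
)


def container_timeline(lines, container_id):
    """Return (time, seconds since previous event, source, message) tuples
    for every log line whose message mentions container_id.

    Only the time of day is compared, so a log that spans midnight would
    produce a negative gap at the rollover.
    """
    events = []
    previous = None
    for line in lines:
        match = GLOG_LINE.match(line)
        if match is None or container_id not in match["msg"]:
            continue
        stamp = datetime.strptime(match["time"], "%H:%M:%S.%f")
        gap = 0.0 if previous is None else (stamp - previous).total_seconds()
        previous = stamp
        events.append((match["time"], gap, match["src"], match["msg"]))
    return events


# The five lines quoted in this report, used as sample input.
SAMPLE = """\
I0821 13:35:01.486346 61392 linux_launcher.cpp:360] Recovered container 02da7be0-271e-449f-9554-dc776adb29a9
I0821 13:35:03.627367 61362 provisioner.cpp:451] Recovered container 02da7be0-271e-449f-9554-dc776adb29a9
I0821 13:35:03.701448 61375 containerizer.cpp:2835] Container 02da7be0-271e-449f-9554-dc776adb29a9 has exited
I0821 13:35:03.701453 61375 containerizer.cpp:2382] Destroying container 02da7be0-271e-449f-9554-dc776adb29a9 in RUNNING state
I0821 13:35:03.701457 61375 containerizer.cpp:2996] Transitioning the state of container 02da7be0-271e-449f-9554-dc776adb29a9 from RUNNING to DESTROYING
""".splitlines()

if __name__ == "__main__":
    cid = "02da7be0-271e-449f-9554-dc776adb29a9"
    for time, gap, src, msg in container_timeline(SAMPLE, cid):
        print(f"{time}  +{gap:.6f}s  {src}  {msg}")
```

      To run it against the full attachment, pass `open("mesos-agent.log")` instead of `SAMPLE`. Comparing the timelines of several containers may show whether they are all reported as exited at the same point of the recovery.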

      I am aware that this is a long shot, but does anyone have an idea what I should look at to narrow down the issue?

      Attachments

        1. mesos-executor-stderr.log
          4 kB
          Stephan Erb
        2. mesos-agent.log
          2.04 MB
          Stephan Erb

          People

            Assignee: Unassigned
            Reporter: Stephan Erb
            Votes: 0
            Watchers: 5
