Mesos / MESOS-2115

Improve recovery of Docker containers when the slave itself runs in a container

Details

    • Type: Epic
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: docker
    • Epic Name: Recover docker in slave container
    • Sprint: Mesosphere Q4 Sprint 3 - 12/7, Mesosphere Q1 Sprint 1 - 1/23, Mesosphere Q1 Sprint 2 - 2/6, Mesosphere Q1 Sprint 3 - 2/20, Mesosphere Q1 Sprint 4 - 3/6, Mesosphere Q1 Sprint 5 - 3/20, Mesosphere Q1 Sprint 6 - 4/3, Mesosphere Q1 Sprint 7 - 4/17, Mesosphere Q2 Sprint 8 - 5/1, Mesosphere Q1 Sprint 9 - 5/15, Mesosphere Sprint 10

    Description

      Currently, when the Docker containerizer recovers, it uses the checkpointed executor pids to determine which containers are still running, and removes any other containers listed by docker ps that it does not recognize.

      This is problematic when the slave itself runs in a Docker container: when the slave container dies, all of its forked processes are removed as well, so the checkpointed executor pids are no longer valid.

      We have to assume that the Docker containers might still be running even though the checkpointed executor pids are not.
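
      The difference between the two recovery strategies can be illustrated with a minimal Python sketch. This is not the Mesos implementation (which is C++); the names `RecoveryPlan`, `plan_by_pid`, `plan_by_docker_state`, and the `pid_alive` callback are hypothetical, and the sketch assumes that "not recognized" means "not recovered from a checkpoint". `checkpointed` maps container IDs to checkpointed executor pids, and `running` stands in for the container IDs reported by docker ps.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass(frozen=True)
class RecoveryPlan:
    recover: Set[str]  # containers the containerizer keeps tracking
    destroy: Set[str]  # containers removed as unrecognized


def plan_by_pid(
    checkpointed: Dict[str, int],
    running: Set[str],
    pid_alive: Callable[[int], bool],
) -> RecoveryPlan:
    """Behavior described in the issue: trust only the checkpointed executor pids."""
    recover = {cid for cid, pid in checkpointed.items() if pid_alive(pid)}
    # Everything else that docker ps reports is treated as unrecognized.
    return RecoveryPlan(recover=recover, destroy=set(running) - recover)


def plan_by_docker_state(
    checkpointed: Dict[str, int],
    running: Set[str],
) -> RecoveryPlan:
    """Assumed alternative: a checkpointed container that Docker still
    reports as running is kept, even if its executor pid is gone."""
    recover = {cid for cid in checkpointed if cid in running}
    return RecoveryPlan(recover=recover, destroy=set(running) - recover)


if __name__ == "__main__":
    checkpointed = {"c1": 101, "c2": 102}
    running = {"c1", "c2", "unknown"}

    # The slave container died, so none of the forked executor pids survive.
    def all_dead(pid: int) -> bool:
        return False

    print(plan_by_pid(checkpointed, running, all_dead))
    print(plan_by_docker_state(checkpointed, running))
```

      In this scenario the pid-based plan would remove c1 and c2 even though Docker still reports them as running, while the Docker-state plan keeps them and removes only the container that was never checkpointed.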

          People

            Assignee: Timothy Chen (tnachen)
            Reporter: Timothy Chen (tnachen)
            Shepherd: Benjamin Hindman
            Votes: 4
            Watchers: 11

            Dates

              Created:
              Updated:
              Resolved:
