Mesos / MESOS-8573

Container stuck in PULLING when Docker daemon hangs

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.4.2, 1.5.1, 1.6.0
    • None

    Description

      Unless the force argument is set to true, Docker::pull always performs a docker inspect call before it performs a docker pull. If either of these two Docker CLI calls hangs indefinitely, the Docker container is stuck in the PULLING state. No further progress is made in the launch() call path: the executor binary is never executed, the Future associated with the launch() call is never failed or satisfied, and wait() is never called on the container. The agent chains the executor cleanup onto that wait() call, which is never made. As a result, when the executor registration timeout elapses, containerizer->destroy() is called on the executor container, but the rest of the executor cleanup is never performed and no terminal task status update is sent.

      This leaves the task destined for that Docker executor stuck in TASK_STAGING from the framework's perspective, and attempts to kill the task will fail.
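
The failure mode described above can be illustrated outside of Mesos with a minimal Python sketch. It is not the fix that was applied to Docker::pull; it only shows the general idea of bounding each CLI call with a timeout so that a hung call turns into an explicit failure instead of a result that never arrives. The names run_cli, pull, PullFailed, HANGING, OK and FAIL are hypothetical, the hung Docker daemon is simulated with a sleeping child process, and the sketch assumes that a successful inspect means the image is already present locally.

```python
import subprocess
import sys


class PullFailed(Exception):
    """Raised when a CLI step of the (hypothetical) pull fails or times out."""


def run_cli(argv, timeout_secs):
    """Run one CLI command, turning a hang into an explicit failure."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # when the command does not finish within timeout_secs.
        return subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_secs)
    except subprocess.TimeoutExpired as e:
        raise PullFailed(f"command timed out after {timeout_secs}s") from e


def pull(inspect_argv, pull_argv, force=False, timeout_secs=5.0):
    """Follow the order in the description: inspect first unless force is set."""
    if not force:
        inspected = run_cli(inspect_argv, timeout_secs)
        # Assumption: a successful inspect means the image is already local.
        if inspected.returncode == 0:
            return "already-present"
    pulled = run_cli(pull_argv, timeout_secs)
    if pulled.returncode != 0:
        raise PullFailed("pull exited with a non-zero status")
    return "pulled"


# Stand-ins for the Docker CLI so the sketch runs without a Docker daemon.
# With a real daemon these would be ["docker", "inspect", image] and
# ["docker", "pull", image].
HANGING = [sys.executable, "-c", "import time; time.sleep(60)"]
OK = [sys.executable, "-c", "print('ok')"]
FAIL = [sys.executable, "-c", "import sys; sys.exit(1)"]

if __name__ == "__main__":
    try:
        pull(HANGING, OK, timeout_secs=0.5)
    except PullFailed as e:
        # The caller now sees a failure it can react to, for example by
        # running cleanup and reporting a terminal status.
        print("launch failed:", e)
```

Because the hang surfaces as an exception, the caller always reaches a point where it can run its cleanup path, which is the step that never happens in the scenario this issue describes.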

          People

            gilbert Gilbert Song
            greggomann Greg Mann
            Greg Mann Greg Mann
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved:

              Agile

                Completed Sprints:
                Mesosphere Sprint 74 ended 15/Feb/18
                Mesosphere Sprint 75 ended 03/Mar/18