Mesos / MESOS-8574

Docker executor makes no progress when 'docker inspect' hangs


Details

    Description

      In the Docker executor, many calls later in the executor's lifecycle are gated on an initial docker inspect call returning: https://github.com/apache/mesos/blob/bc6b61bca37752689cffa40a14c53ad89f24e8fc/src/docker/executor.cpp#L223

      If that first call to docker inspect never returns, the executor becomes stuck in a state where it makes no progress and cannot be killed.
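One way to keep a hung call from blocking the caller indefinitely is to bound the wait. The following is a minimal sketch in plain standard C++, not the Mesos/libprocess code: `callWithTimeout` is a hypothetical helper, and the `call` argument stands in for whatever actually invokes `docker inspect`.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper (not part of Mesos): runs `call` on a detached thread
// and waits at most `timeout` for it to finish. Returns true and stores the
// result in `*output` if the call completed in time; returns false and
// leaves `*output` untouched otherwise. A timed-out call is not cancelled,
// it keeps running in the background.
bool callWithTimeout(
    std::function<std::string()> call,
    std::chrono::milliseconds timeout,
    std::string* output)
{
  // A packaged_task on a detached thread is used rather than std::async,
  // because the future returned by std::async blocks in its destructor,
  // which would reintroduce the hang this helper is meant to avoid.
  std::packaged_task<std::string()> task(std::move(call));
  std::future<std::string> result = task.get_future();
  std::thread(std::move(task)).detach();

  if (result.wait_for(timeout) == std::future_status::ready) {
    *output = result.get();
    return true;
  }

  return false;
}
```

Note that a `false` return only says the call did not answer in time; it says nothing about whether the container itself is healthy.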

      It's tempting to have the executor simply terminate itself after a timeout, but we must be careful about the case in which the executor's Docker container is actually running successfully while the Docker daemon is unresponsive. In that case, we do not want to send TASK_FAILED or TASK_KILLED, because the task itself is still healthy.
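The distinction drawn here can be sketched as a small decision function. This is an illustration under stated assumptions, not the fix adopted in Mesos: the `InspectOutcome` and `Action` enums and the `decide` function are hypothetical names, and the policy of retrying on timeout is only one possible reading of the concern above.

```cpp
// Hypothetical result of a time-bounded `docker inspect` attempt.
enum class InspectOutcome {
  RUNNING,   // The daemon answered and the container is running.
  EXITED,    // The daemon answered and the container has terminated.
  TIMED_OUT  // The daemon did not answer; the container state is unknown.
};

// Hypothetical actions the executor could take in response.
enum class Action {
  CONTINUE,              // Proceed with the normal lifecycle.
  SEND_TERMINAL_STATUS,  // Report a terminal task state.
  RETRY_WITHOUT_STATUS   // Try again later; send no terminal status.
};

// Assumption: a timeout is treated as "state unknown", so no terminal
// status such as TASK_FAILED or TASK_KILLED is sent on that path.
Action decide(InspectOutcome outcome)
{
  switch (outcome) {
    case InspectOutcome::RUNNING:
      return Action::CONTINUE;
    case InspectOutcome::EXITED:
      return Action::SEND_TERMINAL_STATUS;
    case InspectOutcome::TIMED_OUT:
      return Action::RETRY_WITHOUT_STATUS;
  }

  // Unreachable for valid enum values; stay conservative regardless.
  return Action::RETRY_WITHOUT_STATUS;
}
```

The point of the sketch is that an unresponsive daemon and a failed container are different outcomes and should not share a code path.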


          People

            abudnik Andrei Budnik
            greggomann Greg Mann
            Gilbert Song Gilbert Song
            Votes: 0
            Watchers: 5


              Agile

                Completed Sprints:
                Mesosphere Sprint 75 ended 03/Mar/18
                Mesosphere Sprint 76 ended 30/Mar/18