MESOS-6577: Failed to run docker inspect

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Workaround
    • Affects Version/s: 1.0.1
    • Fix Version/s: None
    • Component/s: containerization, docker
    • Labels: None

    Description

      I am running a rocketized (rkt) Mesos agent.
      I am using the Docker containerizer.
      My executors are dockerized.
      The first time I deploy a sample platform, I get errors like the one below:

      Failed to launch container: Failed to run 'docker -H unix:///var/run/docker.sock inspect mesos-84a9df2b-be0e-459e-afc9-b95d4e8ced57-S0.0116a0a2-ccaf-4f1a-846c-361ec4e4a179': exited with status 1; stderr='Error: No such image, container or task: mesos-84a9df2b-be0e-459e-afc9-b95d4e8ced57-S0.0116a0a2-ccaf-4f1a-846c-361ec4e4a179 '
      

      But when I check with docker ps, I can see the supposedly missing container, and I can even run docker inspect on it successfully. Marathon then reschedules the task and I end up with a duplicate container. Neither Mesos nor Marathon lists any duplicate (only Docker does).

      Restarting the mesos-agent removes the container that was reported missing and leaves the other ones running.
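
      One way to spot such leftover containers is to compare Docker's container names against the container IDs the agent reports. The following is a minimal sketch, assuming the `mesos-<agent id>.<container id>` naming pattern seen in the error message above. The helper names `split_mesos_name` and `find_untracked` are hypothetical, and how the list of known container IDs is obtained is left to the reader.

```python
def split_mesos_name(name):
    """Split 'mesos-<agent id>.<container id>' into (agent_id, container_id).

    The pattern is inferred from the error message in this report.
    Returns None for names that do not follow it.
    """
    prefix = "mesos-"
    if not name.startswith(prefix) or "." not in name:
        return None
    # Split on the first dot: agent ID on the left, container ID on the right.
    agent_id, _, container_id = name[len(prefix):].partition(".")
    if not agent_id or not container_id:
        return None
    return agent_id, container_id


def find_untracked(docker_names, known_container_ids):
    """Return Docker container names whose container ID is not in the known set."""
    untracked = []
    for name in docker_names:
        parsed = split_mesos_name(name)
        if parsed is not None and parsed[1] not in known_container_ids:
            untracked.append(name)
    return untracked
```

      The names could come from `docker ps --format '{{.Names}}'`; any name returned by `find_untracked` is a candidate for the duplicate described above.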

      When all my nodes have the Docker image layers cached, I can deploy the sample platform smoothly and I don't get the previous errors.

      If a container needs a remote volume attached (EBS via REX-Ray), the error happens every time, whether or not the image layers are cached.

      Reading the code, I suspect the problem is related to the retryInterval of Docker::inspect (https://github.com/apache/mesos/blob/2e013890e47c30053b7b83cd205b432376589216/src/docker/docker.cpp#L950-L952), but there is no option to modify this setting.
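
      The suspected timing problem can be illustrated with a small polling sketch. This is not the Mesos implementation; it only shows the general pattern of retrying `docker inspect` at a fixed interval until the container becomes visible or a deadline passes. The function name `wait_for_inspect`, its parameters and defaults, and the injectable `run`, `sleep`, and `clock` callables are hypothetical and exist only for illustration.

```python
import subprocess
import time


def wait_for_inspect(name, retry_interval=0.5, timeout=30.0,
                     run=None, sleep=time.sleep, clock=time.monotonic):
    """Poll `docker inspect <name>` until it succeeds or `timeout` elapses.

    Returns the inspect output (a JSON string) on success, or None on timeout.
    All names and defaults here are illustrative, not Mesos settings.
    """
    if run is None:
        def run(container):
            # Real invocation; returns (exit_code, stdout).
            proc = subprocess.run(
                ["docker", "inspect", container],
                capture_output=True, text=True)
            return proc.returncode, proc.stdout

    deadline = clock() + timeout
    while True:
        code, out = run(name)
        if code == 0:
            return out
        if clock() >= deadline:
            # Give up: the container never became visible in time.
            return None
        sleep(retry_interval)
```

      With a slow image pull or a remote volume attach, the container may appear only after several attempts, so both the interval and the overall deadline matter; making them parameters is the kind of option this report asks for.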

          People

            Assignee: Unassigned
            Reporter: Marc Villacorta (h0tbird)