MESOS-5799

docker::inspect() may get wrong output when a docker container is not in "running" state

    Details

      Description

      I (klueska) am copying the text from an email I got about a bug report from Yubo Li at IBM.

      docker::inspect() may get wrong output when the docker container is not in the "running" state. In this case, parsing the "docker inspect" data fails, and the task cannot enter the TASK_RUNNING state.

      I attached the related logs from stderr, in which I printed the "docker inspect" output. The output shows that the container is in the "created" state, not "running", so many of the inspect fields are invalid.

      Possible fix: check the "State->Running" field, and return success from docker::inspect() only when "State->Running" is true.
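
The proposed check could look roughly like the following. This is a minimal Python sketch of the idea, not the Mesos C++ code: the helper name container_is_running and the "running" sample string are assumptions made for illustration, and only the "created" sample is taken (trimmed) from the inspect output in the log below.

```python
import json


def container_is_running(inspect_output: str) -> bool:
    """Return True only if the inspect JSON reports State.Running as true."""
    containers = json.loads(inspect_output)
    # `docker inspect <name>` prints a JSON array with one object per name.
    if not isinstance(containers, list) or len(containers) != 1:
        return False
    container = containers[0]
    if not isinstance(container, dict):
        return False
    state = container.get("State") or {}
    # Treat a missing or non-boolean field as "not running".
    return state.get("Running") is True


# Trimmed from the inspect output shown in the log below.
created = '[{"State": {"Status": "created", "Running": false, "Pid": 0}}]'
# Hypothetical output for the same container once it has started.
running = '[{"State": {"Status": "running", "Running": true, "Pid": 4242}}]'

print(container_is_running(created))
print(container_is_running(running))
```

A caller would treat a False result as "not ready yet" and inspect again later, instead of trying to read fields such as "NetworkSettings" or "Pid" that are still empty in the "created" state.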

      I0706 09:01:05.342895  2975 docker.cpp:780] Running docker -H unix:///var/run/docker.sock run --cpu-shares 512 --memory 536870912 -e MARATHON_APP_VERSION=2016-07-06T08:15:02.610Z -e HOST=9.186.57.67 -e MARATHON_APP_RESOURCE_CPUS=0.5 -e MARATHON_APP_RESOURCE_GPUS=1 -e MARATHON_APP_DOCKER_IMAGE=cuda_test_v0.1 -e PORT_10000=31435 -e MESOS_TASK_ID=ubuntu-gpu-32520.29f083bf-4358-11e6-b886-2ee1446b5607 -e PORT=31435 -e MARATHON_APP_RESOURCE_MEM=512.0 -e PORTS=31435 -e MARATHON_APP_RESOURCE_DISK=0.0 -e MARATHON_APP_LABELS= -e MARATHON_APP_ID=/ubuntu-gpu-32520 -e PORT0=31435 -e MESOS_SANDBOX=/mnt/mesos/sandbox -e MESOS_CONTAINER_NAME=mesos-1875c0d3-9712-43c3-9d58-572c89fac50b-S1.cfe287a0-8a37-4a0f-8ffb-55eb0e6e4439 -v /var/run/mesos/slaves/1875c0d3-9712-43c3-9d58-572c89fac50b-S1/frameworks/aee07017-f8e6-4ed5-8008-b4ea3a090282-0000/executors/ubuntu-gpu-32520.29f083bf-4358-11e6-b886-2ee1446b5607/runs/cfe287a0-8a37-4a0f-8ffb-55eb0e6e4439:/mnt/mesos/sandbox --net host --device=/dev/nvidiactl:/dev/nvidiactl:rwm --device=/dev/nvidia-uvm:/dev/nvidia-uvm:rwm --device=/dev/nvidia0:/dev/nvidia0:rwm --entrypoint /bin/sh --name mesos-1875c0d3-9712-43c3-9d58-572c89fac50b-S1.cfe287a0-8a37-4a0f-8ffb-55eb0e6e4439 cuda_test_v0.1 -c nvidia-smi && sleep 60s
      I0706 09:01:05.345935  2975 docker.cpp:943] Running docker -H unix:///var/run/docker.sock inspect mesos-1875c0d3-9712-43c3-9d58-572c89fac50b-S1.cfe287a0-8a37-4a0f-8ffb-55eb0e6e4439
      I0706 09:01:05.548992  2976 docker.cpp:249] Docker inspect: [
      {
          "Id": "5a4dc17e739b60593c04abf310f2485dddea832476e83007387b612839933f5a",
          "Created": "2016-07-06T09:01:05.531216924Z",
          "Path": "/bin/sh",
          "Args": [
              "-c",
              "nvidia-smi \u0026\u0026 sleep 60s"
          ],
          "State": {
              "Status": "created",
              "Running": false,
              "Paused": false,
              "Restarting": false,
              "OOMKilled": false,
              "Dead": false,
              "Pid": 0,
              "ExitCode": 0,
              "Error": "",
              "StartedAt": "0001-01-01T00:00:00Z",
              "FinishedAt": "0001-01-01T00:00:00Z"
          },
          "Image": "8cf6c8da7045ec24b1e561906dfa54ab0276753ec617e139a7b2da3ef72d245e",
          "ResolvConfPath": "",
          "HostnamePath": "",
          "HostsPath": "",
          "LogPath": "",
          "Name": "/mesos-1875c0d3-9712-43c3-9d58-572c89fac50b-S1.cfe287a0-8a37-4a0f-8ffb-55eb0e6e4439",
          "RestartCount": 0,
          "Driver": "aufs",
          "ExecDriver": "native-0.2",
          "MountLabel": "",
          "ProcessLabel": "",
          "AppArmorProfile": "",
          "ExecIDs": null,
          "HostConfig": {
              "Binds": null,
              "ContainerIDFile": "",
              "LxcConf": null,
              "Memory": 0,
              "MemoryReservation": 0,
              "MemorySwap": 0,
              "KernelMemory": 0,
              "CpuShares": 0,
              "CpuPeriod": 0,
              "CpusetCpus": "",
              "CpusetMems": "",
              "CpuQuota": 0,
              "BlkioWeight": 0,
              "OomKillDisable": false,
              "MemorySwappiness": null,
              "Privileged": false,
              "PortBindings": null,
              "Links": null,
              "PublishAllPorts": false,
              "Dns": null,
              "DnsOptions": null,
              "DnsSearch": null,
              "ExtraHosts": null,
              "VolumesFrom": null,
              "Devices": null,
              "NetworkMode": "",
              "IpcMode": "",
              "PidMode": "",
              "UTSMode": "",
              "CapAdd": null,
              "CapDrop": null,
              "GroupAdd": null,
              "RestartPolicy": {
                  "Name": "",
                  "MaximumRetryCount": 0
              },
              "SecurityOpt": null,
              "ReadonlyRootfs": false,
              "Ulimits": null,
              "LogConfig": {
                  "Type": "json-file",
                  "Config": {}
              },
              "CgroupParent": "",
              "ConsoleSize": [
                  0,
                  0
              ],
              "VolumeDriver": ""
          },
          "GraphDriver": {
              "Name": "aufs",
              "Data": null
          },
          "Mounts": [],
          "Config": {
              "Hostname": "5a4dc17e739b",
              "Domainname": "",
              "User": "",
              "AttachStdin": false,
              "AttachStdout": true,
              "AttachStderr": true,
              "Tty": false,
              "OpenStdin": false,
              "StdinOnce": false,
              "Env": [
                  "MARATHON_APP_VERSION=2016-07-06T08:15:02.610Z",
                  "HOST=9.186.57.67",
                  "MARATHON_APP_RESOURCE_CPUS=0.5",
                  "MARATHON_APP_RESOURCE_GPUS=1",
                  "MARATHON_APP_DOCKER_IMAGE=cuda_test_v0.1",
                  "PORT_10000=31435",
                  "MESOS_TASK_ID=ubuntu-gpu-32520.29f083bf-4358-11e6-b886-2ee1446b5607",
                  "PORT=31435",
                  "MARATHON_APP_RESOURCE_MEM=512.0",
                  "PORTS=31435",
                  "MARATHON_APP_RESOURCE_DISK=0.0",
                  "MARATHON_APP_LABELS=",
                  "MARATHON_APP_ID=/ubuntu-gpu-32520",
                  "PORT0=31435",
                  "MESOS_SANDBOX=/mnt/mesos/sandbox",
                  "MESOS_CONTAINER_NAME=mesos-1875c0d3-9712-43c3-9d58-572c89fac50b-S1.cfe287a0-8a37-4a0f-8ffb-55eb0e6e4439",
                  "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
              ],
              "Cmd": [
                  "-c",
                  "nvidia-smi \u0026\u0026 sleep 60s"
              ],
              "Image": "cuda_test_v0.1",
              "Volumes": null,
              "WorkingDir": "",
              "Entrypoint": [
                  "/bin/sh"
              ],
              "OnBuild": null,
              "Labels": {},
              "StopSignal": "SIGTERM"
          },
          "NetworkSettings": {
              "Bridge": "",
              "SandboxID": "",
              "HairpinMode": false,
              "LinkLocalIPv6Address": "",
              "LinkLocalIPv6PrefixLen": 0,
              "Ports": null,
              "SandboxKey": "",
              "SecondaryIPAddresses": null,
              "SecondaryIPv6Addresses": null,
              "EndpointID": "",
              "Gateway": "",
              "GlobalIPv6Address": "",
              "GlobalIPv6PrefixLen": 0,
              "IPAddress": "",
              "IPPrefixLen": 0,
              "IPv6Gateway": "",
              "MacAddress": "",
              "Networks": null
          }
      }
      ]
      I0706 09:01:05.549659  2976 docker.cpp:335] Unable to detect IP Address at 'NetworkSettings.Networks..IPAddress', attempting deprecated field
      WARNING: Your kernel does not support swap limit capabilities, memory limited without swap.
      I0706 09:01:52.983609  2973 exec.cpp:486] Agent exited, but framework has checkpointing enabled. Waiting 15mins to reconnect with agent 1875c0d3-9712-43c3-9d58-572c89fac50b-S1
      I0706 09:02:06.057607  2978 exec.cpp:549] Executor sending status update TASK_FINISHED (UUID: 2cff35f2-9512-4120-b912-74a82c197696) for task ubuntu-gpu-32520.29f083bf-4358-11e6-b886-2ee1446b5607 of framework aee07017-f8e6-4ed5-8008-b4ea3a090282-0000
      I0706 09:02:06.058717  2980 poll_socket.cpp:131] Socket error while connecting
      I0706 09:02:06.058815  2980 process.cpp:1799] Failed to send 'mesos.internal.StatusUpdateMessage' to '127.0.1.1:5051', connect: Socket error while connecting
      E0706 09:02:06.058931  2980 process.cpp:2104] Failed to shutdown socket with fd 6: Transport endpoint is not connected
      

            People

            • Assignee:
              bmahler Benjamin Mahler
              Reporter:
              klueska Kevin Klues