Mesos / MESOS-8480

Mesos returns high resource usage when killing a Docker task.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.2, 1.5.0, 1.6.0
    • Component/s: containerization
    • Labels:
      None
    • Target Version/s:
    • Sprint:
      Mesosphere Sprint 73
    • Story Points:
      2

      Description

      To get resource statistics for a Docker task, we first look up the task's cgroup path for the relevant subsystem in /proc/<pid>/cgroup (taking the cpuacct subsystem as an example):

      9:cpuacct,cpu:/docker/66fbe67b64ad3a86c6e080e18578bc9e540e55ee0bdcae09c2e131a4264a3a3b
      

      Then read /sys/fs/cgroup/cpuacct//docker/66fbe67b64ad3a86c6e080e18578bc9e540e55ee0bdcae09c2e131a4264a3a3b/cpuacct.stat to get the statistics:

      user 4
      system 0
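
The two-step lookup described above can be sketched as follows. This is a minimal illustration, not the Mesos implementation: the helper names `cgroupPath` and `statPath` are hypothetical, and the mount point `/sys/fs/cgroup/cpuacct` is assumed from the paths shown in this report.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not a Mesos API). Scans the contents of a
// /proc/<pid>/cgroup file, whose lines have the form
// "<hierarchy-id>:<controller-list>:<path>", and stores the path of the
// hierarchy that lists `subsystem` in `*path`. Returns false if the
// subsystem is not listed.
bool cgroupPath(
    const std::string& contents,
    const std::string& subsystem,
    std::string* path)
{
  std::istringstream lines(contents);
  std::string line;
  while (std::getline(lines, line)) {
    const std::string::size_type first = line.find(':');
    if (first == std::string::npos) continue;
    const std::string::size_type second = line.find(':', first + 1);
    if (second == std::string::npos) continue;

    // The controller list is comma-separated, e.g. "cpuacct,cpu".
    std::istringstream controllers(
        line.substr(first + 1, second - first - 1));
    std::string controller;
    while (std::getline(controllers, controller, ',')) {
      if (controller == subsystem) {
        *path = line.substr(second + 1);
        return true;
      }
    }
  }
  return false;
}

// Hypothetical helper: builds the statistics file path by plain
// concatenation, which is why the paths in this report contain "//".
// The mount point is an assumption taken from the report.
std::string statPath(const std::string& cgroup)
{
  return "/sys/fs/cgroup/cpuacct/" + cgroup + "/cpuacct.stat";
}
```

Because the path is built by concatenation, a cgroup of `/` would produce a path under the cpuacct mount point itself rather than under a container's directory.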
      

      However, when a Docker container is being torn down, it seems that Docker or the operating system first moves the process to the root cgroup before actually killing it, making /proc/<pid>/cgroup look like the following:

      9:cpuacct,cpu:/
      

      This makes a racy call to cgroup::internal::cgroup() return a single '/', which in turn makes DockerContainerizerProcess::cgroupsStatistics() read /sys/fs/cgroup/cpuacct///cpuacct.stat, which contains the statistics for the root cgroup:

      user 228058750
      system 24506461
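
One way to avoid reporting these numbers is to refuse to read statistics when the looked-up cgroup is the root. The sketch below illustrates that idea only; it is not necessarily the change that resolved this issue, the helper names `isRootCgroup` and `containerStatPath` are hypothetical, and the mount point is assumed from the paths shown in this report.

```cpp
#include <string>

// Hypothetical helper: returns true if `cgroup` refers to the root
// cgroup, i.e. it consists only of '/' characters. An empty string is
// also treated as root here, since it cannot name a container cgroup.
bool isRootCgroup(const std::string& cgroup)
{
  return cgroup.find_first_not_of('/') == std::string::npos;
}

// Hypothetical wrapper: stores the cpuacct.stat path in `*path` and
// returns true only when the cgroup is not the root, so that a task
// being torn down is never charged with host-wide usage.
bool containerStatPath(const std::string& cgroup, std::string* path)
{
  if (isRootCgroup(cgroup)) {
    return false;
  }

  // Mount point assumed from the report.
  *path = "/sys/fs/cgroup/cpuacct/" + cgroup + "/cpuacct.stat";
  return true;
}
```

A caller would treat a `false` result as "no statistics available for this container" rather than falling back to the root cgroup's counters.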
      

      This can be reproduced with the attached test.cpp and the following command:

      $ docker run --name sleep -d --rm alpine sleep 1000; ./test $(docker inspect sleep | jq .[].State.Pid) & sleep 1 && docker rm -f sleep
      ...
      
      Reading file '/proc/44224/cgroup'
      Reading file '/sys/fs/cgroup/cpuacct//docker/1d79a6c877e2af3081630aa57d23d853e6bd7d210dad28f897556bfea20bc9c1/cpuacct.stat'
      user 4
      system 0
      
      Reading file '/proc/44224/cgroup'
      Reading file '/sys/fs/cgroup/cpuacct///cpuacct.stat'
      user 228058750
      system 24506461
      
      Reading file '/proc/44224/cgroup'
      Reading file '/sys/fs/cgroup/cpuacct///cpuacct.stat'
      user 228058750
      system 24506461
      
      Failed to open file '/proc/44224/cgroup'
      sleep
      [2]-  Exit 1                  ./test $(docker inspect sleep | jq .[].State.Pid)
      

        Attachments

        1. test.cpp
          2 kB
          Chun-Hung Hsiao

            People

            • Assignee:
              chhsia0 Chun-Hung Hsiao
              Reporter:
              chhsia0 Chun-Hung Hsiao
              Shepherd:
              Jie Yu