Mesos / MESOS-8418

mesos-agent high cpu usage because of numerous /proc/mounts reads


Details

    Description

      /proc/mounts is read many, many times from src/(linux/fs|linux/cgroups|slave/slave).cpp.

      When using overlayfs, the /proc/mounts contents can become quite large.
      As an example, one of our single-node QA hosts, running ~150 tasks, has a /proc/mounts file of 361 lines / 201299 characters.

      This 200 kB file is read on this node about 25 to 150 times per second, which is a huge waste of CPU and I/O time.
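
      To put these figures in perspective, here is a rough back-of-the-envelope calculation of how much data is regenerated and re-read every second. It assumes one byte per character; the function name is only illustrative.

```python
# Figures taken from the report: file size in characters and observed read rates.
MOUNTS_SIZE_BYTES = 201_299
READS_PER_SECOND = (25, 150)


def read_throughput_mb_per_s(size_bytes: int, reads_per_second: int) -> float:
    """Data re-read per second, in MB (10**6 bytes)."""
    return size_bytes * reads_per_second / 1_000_000


low, high = (read_throughput_mb_per_s(MOUNTS_SIZE_BYTES, r) for r in READS_PER_SECOND)
print(f"{low:.1f} to {high:.1f} MB/s")  # prints: 5.0 to 30.2 MB/s
```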

      Most of these calls are related to cgroups.

      Please consider these proposals:

      1/ Is /proc/mounts mandatory for cgroups?
      We already have the list of cgroup subsystems from /proc/cgroups.
      The only useful information from /proc/mounts seems to be the root mount point,
      /sys/fs/cgroup/, which could be obtained with a single read at agent start.
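
      A minimal sketch of proposal 1, in Python for brevity rather than the agent's C++. It is not how Mesos works today. The helper names are hypothetical, and two assumptions come from the listing further down: the root is a tmpfs whose device field is "cgroup", and each subsystem is mounted in a directory named after it directly under that root. Hosts that co-mount subsystems (for example cpu,cpuacct) would need more care.

```python
import posixpath
from typing import List, Optional


def enabled_subsystems(proc_cgroups_text: str) -> List[str]:
    """Return the enabled subsystem names from the text of /proc/cgroups."""
    names = []
    for line in proc_cgroups_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header line
        fields = line.split()
        # Columns: subsys_name, hierarchy, num_cgroups, enabled
        if len(fields) == 4 and fields[3] == "1":
            names.append(fields[0])
    return names


def cgroup_root(proc_mounts_text: str) -> Optional[str]:
    """Return the cgroup root mount point, meant to be called once at start.

    Assumption: the root is the tmpfs entry whose device field is "cgroup",
    as in the report's listing.
    """
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "cgroup" and fields[2] == "tmpfs":
            return fields[1]
    return None


def hierarchy_path(root: str, subsystem: str) -> str:
    """Assumed layout: one directory per subsystem directly under the root."""
    return posixpath.join(root, subsystem)
```

      At agent start, the root would be read once, for example with cgroup_root(open("/proc/mounts").read()), and every later lookup would combine it with a name from /proc/cgroups, which is a much smaller file.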

      2/ Use /proc/self/mountstats

      wc /proc/self/mounts /proc/self/mountstats
      361 2166 201299 /proc/self/mounts
      361 2888 50200 /proc/self/mountstats
      
      grep cgroup /proc/self/mounts
      cgroup /sys/fs/cgroup tmpfs rw,relatime,mode=755 0 0
      cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset 0 0
      cgroup /sys/fs/cgroup/cpu cgroup rw,relatime,cpu 0 0
      cgroup /sys/fs/cgroup/cpuacct cgroup rw,relatime,cpuacct 0 0
      cgroup /sys/fs/cgroup/blkio cgroup rw,relatime,blkio 0 0
      cgroup /sys/fs/cgroup/memory cgroup rw,relatime,memory 0 0
      cgroup /sys/fs/cgroup/devices cgroup rw,relatime,devices 0 0
      cgroup /sys/fs/cgroup/freezer cgroup rw,relatime,freezer 0 0
      cgroup /sys/fs/cgroup/net_cls cgroup rw,relatime,net_cls 0 0
      cgroup /sys/fs/cgroup/perf_event cgroup rw,relatime,perf_event 0 0
      cgroup /sys/fs/cgroup/net_prio cgroup rw,relatime,net_prio 0 0
      cgroup /sys/fs/cgroup/pids cgroup rw,relatime,pids 0 0
      
      grep cgroup /proc/self/mountstats
      device cgroup mounted on /sys/fs/cgroup with fstype tmpfs
      device cgroup mounted on /sys/fs/cgroup/cpuset with fstype cgroup
      device cgroup mounted on /sys/fs/cgroup/cpu with fstype cgroup
      device cgroup mounted on /sys/fs/cgroup/cpuacct with fstype cgroup
      device cgroup mounted on /sys/fs/cgroup/blkio with fstype cgroup
      device cgroup mounted on /sys/fs/cgroup/memory with fstype cgroup
      device cgroup mounted on /sys/fs/cgroup/devices with fstype cgroup
      device cgroup mounted on /sys/fs/cgroup/freezer with fstype cgroup
      device cgroup mounted on /sys/fs/cgroup/net_cls with fstype cgroup
      device cgroup mounted on /sys/fs/cgroup/perf_event with fstype cgroup
      device cgroup mounted on /sys/fs/cgroup/net_prio with fstype cgroup
      device cgroup mounted on /sys/fs/cgroup/pids with fstype cgroup
      

      This file contains all the required information and is about 4x smaller (50200 vs. 201299 characters).
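
      As an illustration of proposal 2, the sketch below extracts the cgroup mount points from mountstats-style text. It is written in Python for brevity and the function name is hypothetical. Note that the lines shown above carry no mount options, so the subsystem would have to be inferred from the mount point path, which is an assumption about the host layout.

```python
import re
from typing import Dict

# Matches lines such as:
#   device cgroup mounted on /sys/fs/cgroup/cpuset with fstype cgroup
_MOUNTSTATS_LINE = re.compile(r"^device (\S+) mounted on (\S+) with fstype (\S+)")


def cgroup_mounts(mountstats_text: str) -> Dict[str, str]:
    """Map mount point -> fstype for entries whose device is 'cgroup'."""
    mounts = {}
    for line in mountstats_text.splitlines():
        match = _MOUNTSTATS_LINE.match(line.strip())
        if match and match.group(1) == "cgroup":
            mounts[match.group(2)] = match.group(3)
    return mounts
```

      Calling cgroup_mounts(open("/proc/self/mountstats").read()) on the node above would return the twelve entries listed, keyed by mount point.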

      3/ microcaching
      Caching cgroups data for just 1 second would be a huge performance improvement, but I'm not aware of the possible side effects.
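
      A minimal sketch of what such a micro-cache could look like, again in Python for brevity. The class name and the injectable clock are hypothetical, not part of Mesos. One side effect that follows directly from the design is that a mount created or removed during the TTL window stays invisible to callers until the entry expires.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TtlCache(Generic[T]):
    """Caches the result of `load` and reloads it once `ttl_seconds` have passed."""

    def __init__(self, load: Callable[[], T], ttl_seconds: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so that tests can control time
        self._value: Optional[T] = None
        self._expires_at: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        # Reload on first use or when the cached value has expired.
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._load()
            self._expires_at = now + self._ttl
        return self._value
```

      With mounts = TtlCache(lambda: open("/proc/mounts").read()), repeated calls to mounts.get() would read the file at most once per second, so 25 to 150 reads per second would collapse to one.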

      Attachments

        1. mesos-agent-flamegraph.png
          422 kB
          Stephan Erb
        2. mesos-agent.stacks.gz
          233 kB
          Stephan Erb
        3. image-2018-08-06-13-49-03-317.png
          71 kB
          Stephan Erb
        4. image-2018-08-06-13-49-03-241.png
          91 kB
          Stephan Erb

            People

              bmahler Benjamin Mahler
              kaalh Stéphane Cottin