Mesos / MESOS-6563

Shared Filesystem Isolator does not clean up mounts

Details

    Description

      While testing the agent's 'filesystem/shared' isolator, we discovered that its mounts are never unmounted. Agents ended up with thousands of mounts, one for each task that had run.

      To reproduce the problem, start a Mesos agent with --isolation="filesystem/shared" and --default_container_info="file:///tmp/the-container-info-below.json", then launch and kill several tasks. After the tasks are killed, the mount points should be unmounted, but they are not.

      container info
      {
          "type": "MESOS",
          "volumes": [
              {
                  "container_path": "/tmp",
                  "host_path": "tmp",
                  "mode": "RW"
              }
          ]
      }
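
The reproduction setup above can be scripted. The following is a minimal sketch, assuming only the two agent flags and the container info given in this report; the file name `container-info.json` and the helper names `write_container_info` and `agent_flags` are hypothetical and chosen for illustration. It writes the JSON to a file and prints the flags to pass to the agent; it does not start the agent itself.

```python
import json
import tempfile
from pathlib import Path

# Container info copied from the report; host_path is relative, as reported.
CONTAINER_INFO = {
    "type": "MESOS",
    "volumes": [
        {"container_path": "/tmp", "host_path": "tmp", "mode": "RW"}
    ],
}


def write_container_info(directory):
    """Write the container info JSON into `directory` and return its path."""
    path = Path(directory) / "container-info.json"  # hypothetical file name
    path.write_text(json.dumps(CONTAINER_INFO, indent=4))
    return path


def agent_flags(container_info_path):
    """Build the two agent flags named in the report for the given file."""
    uri = Path(container_info_path).resolve().as_uri()  # file:///... URI
    return [
        "--isolation=filesystem/shared",
        f"--default_container_info={uri}",
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for flag in agent_flags(write_container_info(tmp)):
            print(flag)
```

In a real reproduction the file would need to outlive the script, so a fixed directory would replace the temporary one used here.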
      

      According to the comment in the isolator source, mounts are supposed to be cleaned up automatically by the kernel when the container's last process exits:

      // We only need to implement the `prepare()` function in this
      // isolator. There is nothing to recover because we do not keep any
      // state and do not monitor filesystem usage or perform any action on
      // cleanup. Cleanup of mounts is done automatically by the kernel
      // when the mount namespace is destroyed after the last process
      // terminates.
      Future<Option<ContainerLaunchInfo>> SharedFilesystemIsolatorProcess::prepare(
          const ContainerID& containerId,
          const ContainerConfig& containerConfig)
      {
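
The source comment assumes that each container has its own mount namespace. One possible check on a live agent is to compare the mount namespace identifiers that Linux exposes as the symlink `/proc/<pid>/ns/mnt`. The sketch below is a diagnostic aid under that assumption, not part of Mesos; the helper names and the `proc_root` parameter are hypothetical, and reading another process's namespace link usually requires root.

```python
import os


def mount_namespace_id(pid, proc_root="/proc"):
    """Return the mount namespace identifier of `pid`, e.g. 'mnt:[4026531840]'."""
    return os.readlink(os.path.join(proc_root, str(pid), "ns", "mnt"))


def shares_mount_namespace(pid_a, pid_b, proc_root="/proc"):
    """Return True if both processes are in the same mount namespace."""
    return mount_namespace_id(pid_a, proc_root) == mount_namespace_id(
        pid_b, proc_root
    )
```

If a task process and the agent report the same identifier, mounts made for that task live in the agent's namespace and would not be removed when the task exits. Different identifiers do not rule out a leak, since mounts can also propagate between namespaces.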
      

      During testing we found that an agent would accumulate thousands of dangling mounts, all of them attributed to the Mesos agent process:

      root[7]server-001 ~ # tail /proc/mounts
      /dev/sda1 /var/lib/mesos/slaves/b600ee92-bb38-4447-984a-4047c3d2c176-S2/frameworks/201103282247-0000000019-0000/executors/thermos-drobinson-test-sleep2-0-dda59747-848a-4b3b-8424-d0032f8a38f7/runs/e31bea31-22d7-4758-bc8b-6837919d7ed7/tmp xfs rw,noatime,attr2,nobarrier,inode64,prjquota 0 0
      /dev/sda1 /var/lib/mesos/slaves/b600ee92-bb38-4447-984a-4047c3d2c176-S2/frameworks/201103282247-0000000019-0000/executors/thermos-drobinson-test-sleep2-0-3a001926-a442-45c4-9cbc-dad182954fed/runs/bd0a8e36-d147-4511-9cc5-afff9f1c0fbe/tmp xfs rw,noatime,attr2,nobarrier,inode64,prjquota 0 0
      /dev/sda1 /var/lib/mesos/slaves/b600ee92-bb38-4447-984a-4047c3d2c176-S2/frameworks/201103282247-0000000019-0000/executors/thermos-drobinson-test-sleep2-0-04204a72-53d8-44a8-bac5-613835ff85a7/runs/967739ea-5284-41ed-af1a-1cb5a77dd690/tmp xfs rw,noatime,attr2,nobarrier,inode64,prjquota 0 0
      /dev/sda1 /var/lib/mesos/slaves/b600ee92-bb38-4447-984a-4047c3d2c176-S2/frameworks/201103282247-0000000019-0000/executors/thermos-drobinson-test-sleep2-0-95d1ac39-323a-4c15-b1dc-645ed79c4128/runs/6ff6d2b3-2867-4ad4-b2bb-20e27a0fa925/tmp xfs rw,noatime,attr2,nobarrier,inode64,prjquota 0 0
      /dev/sda1 /var/lib/mesos/slaves/b600ee92-bb38-4447-984a-4047c3d2c176-S2/frameworks/201103282247-0000000019-0000/executors/thermos-drobinson-test-sleep2-0-91f6a946-f560-43a3-95c2-424c5dd71684/runs/a4821acc-58f8-4457-bdc9-bd83bdeb8231/tmp xfs rw,noatime,attr2,nobarrier,inode64,prjquota 0 0
      /dev/sda1 /var/lib/mesos/slaves/b600ee92-bb38-4447-984a-4047c3d2c176-S2/frameworks/201103282247-0000000019-0000/executors/thermos-drobinson-test-sleep2-0-dd3b34f1-10c6-43d3-8741-a3164a642e93/runs/0ef8cf17-6c18-48a4-9943-66c448de5d44/tmp xfs rw,noatime,attr2,nobarrier,inode64,prjquota 0 0
      /dev/sda1 /var/lib/mesos/slaves/b600ee92-bb38-4447-984a-4047c3d2c176-S2/frameworks/201103282247-0000000019-0000/executors/thermos-drobinson-test-sleep2-0-fb704ef8-1cf9-4d35-854d-7b6247cf4bc2/runs/e65ec976-057f-4939-9053-1ddcddfc98f8/tmp xfs rw,noatime,attr2,nobarrier,inode64,prjquota 0 0
      /dev/sda1 /var/lib/mesos/slaves/b600ee92-bb38-4447-984a-4047c3d2c176-S2/frameworks/201103282247-0000000019-0000/executors/thermos-drobinson-test-sleep2-0-cdf7b06d-2265-41fe-b1e9-84366dc88b62/runs/1bed4289-7442-4a91-bf45-a7de10ab79bb/tmp xfs rw,noatime,attr2,nobarrier,inode64,prjquota 0 0
      /dev/sda1 /var/lib/mesos/slaves/b600ee92-bb38-4447-984a-4047c3d2c176-S2/frameworks/201103282247-0000000019-0000/executors/thermos-drobinson-test-sleep2-0-58582496-e551-4d80-8ae5-9eacac5e8a36/runs/6b5a7f56-af89-4eab-bbfa-883ca43744ad/tmp xfs rw,noatime,attr2,nobarrier,inode64,prjquota 0 0
      /dev/sda1 /var/lib/mesos/slaves/b600ee92-bb38-4447-984a-4047c3d2c176-S2/frameworks/201103282247-0000000019-0000/executors/thermos-drobinson-test-sleep2-0-5d6bc25a-6ba7-48f9-9655-85da6ff0a383/runs/d5cc4b31-7876-4bca-b1fa-b177c5d88bfc/tmp xfs rw,noatime,attr2,nobarrier,inode64,prjquota 0 0
      root[7]server-001 ~ # grep -c 'drobinson-test-sleep2' /proc/mounts
      4950
      root[7]server-001 ~ # pgrep -f /usr/local/bin/mesos-slave
      27799
      root[7]server-001 ~ # wc -l /proc/27799/mounts
      5079 /proc/27799/mounts
      root[7]server-001 ~ # grep -c 'drobinson-test-sleep2' /proc/27799/mounts
      4950
      root[7]server-001 ~ # ps auxww | grep 'drobinson-test-sleep2' -c
      5
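
The manual `grep -c` checks above can be generalized. The following is a rough sketch that finds sandbox mounts in a mount table and reports those whose run ID is not in a caller-supplied set of live runs. The regular expression is inferred from the paths shown in this report, and `sandbox_mounts`, `dangling_mounts`, and `live_runs` are hypothetical names; how the set of live runs is obtained is left to the caller.

```python
import re

# Sandbox layout as it appears in the mount points above:
# .../slaves/<agent>/frameworks/<fw>/executors/<executor>/runs/<run>/<volume>
SANDBOX_MOUNT = re.compile(
    r"/slaves/[^/]+/frameworks/[^/]+/executors/(?P<executor>[^/]+)"
    r"/runs/(?P<run>[^/]+)/"
)


def sandbox_mounts(mounts_text):
    """Yield (executor, run, mount_point) for each mount under a sandbox."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        mount_point = fields[1]  # second column of /proc/mounts
        match = SANDBOX_MOUNT.search(mount_point)
        if match:
            yield match["executor"], match["run"], mount_point


def dangling_mounts(mounts_text, live_runs):
    """Return sandbox mount points whose run ID is not in `live_runs`."""
    return [
        mount_point
        for _, run, mount_point in sandbox_mounts(mounts_text)
        if run not in live_runs
    ]


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        # With no live runs given, every sandbox mount is reported.
        leftovers = dangling_mounts(f.read(), live_runs=set())
    print(len(leftovers), "sandbox mounts")
```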
      

            People

              ipronin Ilya
              drobinson Daniel Robinson