Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Environment: Ubuntu 15, or any OS with kernel 4.0+
Description
This issue shows up when testing the unified containerizer with the overlayfs backend using any image with numerous layers (e.g., 38 layers). It can be reproduced with this image: `gilbertsong/cirros:34` (for anyone who wants to test it out).
Here is the partial log:
I0805 21:50:02.631873 11136 provisioner.cpp:315] Provisioning image rootfs '/tmp/provisioner/containers/36c69ade-69db-4de3-9cd4-18b9b9c99e73/backends/overlay/rootfses/ba255b76-8326-4611-beb5-002f202b52e0' for container 36c69ade-69db-4de3-9cd4-18b9b9c99e73 using overlay backend I0805 21:50:02.632990 11138 overlay.cpp:156] Provisioning image rootfs with overlayfs: 'lowerdir=/tmp/mesos/store/docker/layers/0b3552c520cda8ec7b81c0245f62e14dfb5214b7dce4da70d4124c19b64c70b9/rootfs:/tmp/mesos/store/docker/layers/dcdb76907cb758920f4eaabc338a9bf229be790a184bdd1e963480a03a7eacfa/rootfs:/tmp/mesos/store/docker/layers/c562a889ec2700b07f1bfb00c8de7f35568420b62d1e8160962628fcb9852f32/rootfs:/tmp/mesos/store/docker/layers/e27aafe45078f82cd69baa397b72ecfb4e8778040bfd8241aa0f4189612f294e/rootfs:/tmp/mesos/store/docker/layers/f40f6d4dc7496d9936ba9c2c1aa5a28a0b8b08f58eaeeec7f17330926f0acd8f/rootfs:/tmp/mesos/store/docker/layers/4e73c54df43c79d944a7b9d365f73464e547a857ad723aae285f9803c506a99f/rootfs:/tmp/mesos/store/docker/layers/0381bc1361243e9e0adf522135e31d85edeb837948985d4a6cf37ba6af21f2c7/rootfs:/tmp/mesos/store/docker/layers/8c4a4d5185324d29d1e4b36d8178842f4bcfcc7cc264666ab1b355668adfc97f/rootfs:/tmp/mesos/store/docker/layers/56157927e47e4774f858d3706262dc2e5921be0e7d0ceb741645513746fdedea/rootfs:/tmp/mesos/store/docker/layers/630c68a1627d8f6582569cc008f9a06b893fa7894dc290635dd454b00e894873/rootfs:/tmp/mesos/store/docker/layers/82273458148226630bbea90cf12b72cdc867faf152049361d1e97c8a426ae009/rootfs:/tmp/mesos/store/docker/layers/7fb31183c817b9bc0db5697d70753df4b1bf8e1012cd8c834931b595d846ab54/rootfs:/tmp/mesos/store/docker/layers/31c4f23aaccfd222b73622bfef533b52912f19e7569a568f7d58d40f645bcd86/rootfs:/tmp/mesos/store/docker/layers/16896c1cea9f9c911668eef2ad0af8aa2db689c27127169880e1df75d5a9151b/rootfs:/tmp/mesos/store/docker/layers/8a9f03cff6171de90b2fe6e00d00b17993f8811814be4e91b0da1ae55dfa616d/rootfs:/tmp/mesos/store/docker/layers/5fb7fd9fb5b0fdde1bd2f8b071b23f8ae8c0a685056a40fd2
2dbe88f37a4fde9/rootfs:/tmp/mesos/store/docker/layers/64988a98c6a682fef16bd69e3d48cc49024d1c0f6526c4b21169fa3f81dc7d60/rootfs:/tmp/mesos/store/docker/layers/253759d741f48d5741b14f3e4d19ea165f326b15ec404fcc0d4741c274d0af29/rootfs:/tmp/mesos/store/docker/layers/5f2b648ae86db5bfc8f2b01739fd561325d91a7f905f6599032b78065ba929fa/rootfs:/tmp/mesos/store/docker/layers/700018f2c4c21668e0935aae9edc09f0f5df72ca2e58c0cdf5d61313018f3528/rootfs:/tmp/mesos/store/docker/layers/99016394fafebd1dad47724121998aecf0782da93eedc9bd9d6d2af478a798a4/rootfs:/tmp/mesos/store/docker/layers/9a711ed91d6a74f0c4d5e7ea1e44c9e3d0e90e3083e889625eb765acddfd4ea6/rootfs:/tmp/mesos/store/docker/layers/d9c00b1f35232ab21f2ac182194acd381ec096dc8c25c4d40b2e84695e2d6b91/rootfs:/tmp/mesos/store/docker/layers/10e9d3ad1d49d649a63536a227b8f93e8dd8f0bcde1ab127f0c62da26ea09469/rootfs:/tmp/mesos/store/docker/layers/819293665a9f634bf2e149b2441ee82ddc74d38e7a6d0c90491bffe5e6b5ae22/rootfs:/tmp/mesos/store/docker/layers/a0ed5b96a63de8623f77e7107b888f2945fcf069dd4440f3cafd13de408a8fb9/rootfs:/tmp/mesos/store/docker/layers/2756be24c0982a13a523a5ce04535578c27f00fc3a77321dfdb537ea5d323470/rootfs:/tmp/mesos/store/docker/layers/b820bc0393598343b8f05e6e61b899e00ee1e72cfce9b70dd04d004794ca02a6/rootfs:/tmp/mesos/store/docker/layers/8245da6b1667e1b5aac028f6729620459595e7148340d4db6a9f912cda7523a1/rootfs:/tmp/mesos/store/docker/layers/87886e37285d0182cfb4f83dec9239ce6cc094e699a6de3c4507789ec6a80870/rootfs:/tmp/mesos/store/docker/layers/8568fa3ad8b47e7565a9833b2950d023cf82558b40a0508ed155ebe71e8fa8b2/rootfs:/tmp/mesos/store/docker/layers/98986dcc611643e2291913352f0f2df37ac5b068072b7f1d01ed87532cba4f23/rootfs:/tmp/mesos/store/docker/layers/b96b0a4229bbb38fc20da48f539c8473fa255fd42282d97ac4de071342c57c58/rootfs:/tmp/mesos/store/docker/layers/2b9fd04b9d5a26be9cc150f408657c553ed9479a43ff60c0bbf8f586c3dfd1e9/rootfs:/tmp/mesos/store/docker/layers/0d27f8e693fb23b476ae409bd008492a92b355aa3ac10cf536dabd458758af55/rootfs:/tmp/mesos/store/doc
ker/layers/500e7eced838c4822a111abdb64fce8e7f3c0ecaf3d47157331b0cd30ebac4dc/rootfs:/tmp/mesos/store/docker/layers/c42d375c72b4e709bc0eeda368591277fa73836dfd5597fe98e2524c8587536e/rootfs:/tmp/mesos/store/docker/layers/34fa5867b8b0888ea3b718df9ad2925b8f7f50b6583b7cbdfabd826bfe5c6de8/rootfs:/tmp/mesos/store/docker/layers/3690474eb5b4b26fdfbd89c6e159e8cc376ca76ef48032a30fa6aafd56337880/rootfs,upperdir=/tmp/provisioner/containers/36c69ade-69db-4de3-9cd4-18b9b9c99e73/backends/overlay/scratch/ba255b76-8326-4611-beb5-002f202b52e0/upperdir,workdir=/tmp/provisioner/containers/36c69ade-69db-4de3-9cd4-18b9b9c99e73/backends/overlay/scratch/ba255b76-8326-4611-beb5-002f202b52e0/workdir' E0805 21:50:02.634330 11138 slave.cpp:4029] Container '36c69ade-69db-4de3-9cd4-18b9b9c99e73' for executor 'test' of framework 7807e8fb-2265-44cb-ac0a-a8cbc969784b-0000 failed to start: Failed to mount rootfs '/tmp/provisioner/containers/36c69ade-69db-4de3-9cd4-18b9b9c99e73/backends/overlay/rootfses/ba255b76-8326-4611-beb5-002f202b52e0' with overlayfs: Invalid argument I0805 21:50:02.635247 11134 containerizer.cpp:1637] Destroying container '36c69ade-69db-4de3-9cd4-18b9b9c99e73' I0805 21:50:02.635674 11134 containerizer.cpp:1640] Waiting for the provisioner to complete for container '36c69ade-69db-4de3-9cd4-18b9b9c99e73' I0805 21:50:02.637307 11135 provisioner.cpp:458] Destroying container rootfs at '/tmp/provisioner/containers/36c69ade-69db-4de3-9cd4-18b9b9c99e73/backends/overlay/rootfses/ba255b76-8326-4611-beb5-002f202b52e0' for container 36c69ade-69db-4de3-9cd4-18b9b9c99e73
The root cause is that overlayfs does not accept an option string that is too long when mounting the overlay. The limit is most likely 4 KB, which would match the usual page size.
After investigating with zhitao: if users want to run a custom large image (one with numerous layers) on the unified containerizer with the overlayfs backend, the number of image layers should be approximately less than or equal to:
4096 / (80 + len(flags.docker_store_dir))
So in most cases this number should be 34 or 35 when using the default docker store dir.
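As a rough illustration of the estimate above, here is a minimal Python sketch. The names `max_overlay_layers` and `fixed_overhead` are hypothetical and not part of Mesos. The constant 80 comes from the formula and corresponds to the per-layer `/layers/<64-character id>/rootfs:` suffix seen in the log. The formula itself ignores the `lowerdir=`, `upperdir=` and `workdir=` parts of the option string, so the plain result is best read as an optimistic ceiling.

```python
OPTION_LIMIT = 4096      # assumed limit on the mount option string, in bytes
PER_LAYER_OVERHEAD = 80  # "/layers/" + 64-char layer id + "/rootfs" + ":" separator


def max_overlay_layers(docker_store_dir, fixed_overhead=0):
    """Estimate how many layers fit into one overlayfs option string.

    fixed_overhead is a hypothetical knob for the bytes taken by the
    "lowerdir=", "upperdir=..." and "workdir=..." parts, which the
    formula in the issue does not account for.
    """
    per_layer = PER_LAYER_OVERHEAD + len(docker_store_dir)
    return (OPTION_LIMIT - fixed_overhead) // per_layer


if __name__ == "__main__":
    # Store dir from the log above, followed by two shorter alternatives.
    for store_dir in ("/tmp/mesos/store/docker", "/tmp", "/t"):
        print(store_dir, max_overlay_layers(store_dir))
```

Passing a non-zero `fixed_overhead` lowers the estimate, which is one reason the practical limit can be smaller than the bare formula suggests.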
We found two workarounds for this issue:
1. Keep your `--docker_store_dir` as short as possible, e.g., `/tmp` or `/t`.
2. Squash your large image into fewer layers.
Ideally, this issue would be resolved in the kernel by simply increasing the size limit of the overlayfs option string. But since that may take a long time, here is something we can do in the Mesos code base:
This follows the workaround used in Docker (thanks zhitao): https://docs.docker.com/engine/userguide/storagedriver/overlayfs-driver/
First, calculate the size of the option string to determine whether shortened symlinks are needed. If so, short symlinks can be created to make the overlay mount option much shorter. To achieve this, we can do something slightly different from Docker, which keeps it simpler:
1. Create a temporary directory (mktemp) under /tmp.
2. Create symlinks `1`, `2`, ..., `n`, one for each layer.
3. Carry this information using the backenddir in overlayfs.
4. Remove these symlinks on destroy.
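The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Mesos implementation: `link_layers` and `unlink_layers` are hypothetical helper names, the actual mount call is left out, and how the links directory would be recorded in the backend dir is not shown.

```python
import os
import tempfile


def link_layers(layers, parent="/tmp"):
    """Create symlinks 1..n in a fresh temp dir, one per layer rootfs.

    Returns (links_dir, lowerdir), where lowerdir is the colon-separated
    list of short symlink paths for the overlayfs "lowerdir=" option.
    """
    links_dir = tempfile.mkdtemp(dir=parent)
    short_paths = []
    for index, layer in enumerate(layers, start=1):
        link = os.path.join(links_dir, str(index))
        os.symlink(layer, link)  # the link points at the real layer rootfs
        short_paths.append(link)
    return links_dir, ":".join(short_paths)


def unlink_layers(links_dir):
    """Remove the symlinks and the temp dir; the layers stay untouched."""
    for name in os.listdir(links_dir):
        os.unlink(os.path.join(links_dir, name))
    os.rmdir(links_dir)
```

With this layout each layer costs roughly the length of the temp dir path plus a few digits, instead of the store dir plus a 64-character layer id.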
Issue Links
- is related to
  - MESOS-6001 Aufs backend cannot support the image with numerous layers. (Resolved)
  - MESOS-5931 Support auto backend in Unified Containerizer. (Resolved)
Another suggestion: until this gets fixed, maybe we can detect these very long options and cowardly refuse to provision the image up front with a clear message?
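A minimal sketch of such an up-front check, in Python for illustration only. The function name `check_overlay_options` is hypothetical, and the 4096-byte default is an assumption taken from the "most likely 4 KB" estimate in the description, not a verified kernel constant.

```python
def check_overlay_options(options, limit=4096):
    """Return the option size in bytes, or raise if it looks too long.

    limit defaults to 4096 as an assumption; ">=" leaves room for a
    terminating NUL byte, which is also an assumption.
    """
    size = len(options.encode("utf-8"))
    if size >= limit:
        raise ValueError(
            "Refusing to provision: overlayfs mount options are %d bytes, "
            "which exceeds the assumed limit of %d bytes; use a shorter "
            "store dir or an image with fewer layers" % (size, limit)
        )
    return size
```

Failing here, before the mount is attempted, would replace the opaque `Invalid argument` error in the log with a message that points at the layer count and the store dir length.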