The 'pivot_root' syscall requires that the new root not be on the same filesystem (mount) as the old root. Otherwise, the call fails with a "Device or resource busy" (EBUSY) error.
Currently, we rely on the provisioner to prepare the rootfs and to do a proper bind mount if needed so that pivot_root can succeed. The drawback of this approach is that it potentially pollutes the host mount table, which then requires cleanup logic.
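The bind mount in question is a single mount(2) call that mounts the rootfs onto itself, which turns the directory into a mount point. A minimal sketch, assuming Linux; the helper name selfBindMount is hypothetical and is not an existing function in the code base:

```cpp
#include <sys/mount.h>

#include <cerrno>
#include <string>

// Hypothetical helper, for illustration only.
// Recursively bind mounts `rootfs` onto itself so that it becomes a
// mount point distinct from the mount it lives on.
// Returns 0 on success, or the errno value reported by mount(2).
int selfBindMount(const std::string& rootfs)
{
  // For a bind mount, the filesystem type and data arguments are ignored.
  if (::mount(
          rootfs.c_str(),
          rootfs.c_str(),
          nullptr,
          MS_BIND | MS_REC,
          nullptr) != 0) {
    return errno;
  }

  return 0;
}
```

When this is called in the host mount namespace, the new mount shows up in the host mount table and stays there until it is unmounted.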
For instance, in the test, we create a test rootfs by copying the host files. We need to do a self bind mount so that we can pivot_root into it. This pollutes the host mount table, and mounts might leak if the test crashes before we do the lazy umount:
What I propose is that we always perform a recursive self bind mount of the rootfs in fs::chroot::enter (after entering the new mount namespace). It seems that this is also done in libcontainer: