Regression wrt mounting /proc in user namespace in 3.13

Serge E. Hallyn serge at hallyn.com
Sat Nov 16 16:48:40 UTC 2013


Quoting Daniel P. Berrange (berrange at redhat.com):
> Just testing libvirt with user namespaces on current Fedora rawhide
> 3.13.0-0.rc0.git3.2.fc21.x86_64 kernel, I'm now getting an error when
> we attempt to mount /proc

Thanks. I saw the same thing with 3.12 on Friday afternoon, and decided
I must be too haggard from a week of unrelated work to think straight.

This will definitely be a problem: it makes user namespaces unusable for
containers.

>   # virsh -c lxc:/// start shell
>   error: Failed to start domain shell
>   error: internal error: guest failed to start: Failed to mount proc on /proc type proc flags=e: Operation not permitted
> 
> The syscall failing is
> 
>   mount("proc", "/proc", "proc", MS_NOSUID|MS_NODEV|MS_NOEXEC, NULL) = -1 EPERM (Operation not permitted)
> 
> 
> On the host OS the default Fedora environment has the following mounts
> present
> 
>   # grep /proc /proc/mounts 
>   proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
>   systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=41,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0
>   binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
>   sunrpc /proc/fs/nfsd nfsd rw,relatime 0 0
> 
>   # ls /proc/fs/nfsd/
>   export_features  filehandle      nfsv4gracetime  nfsv4recoverydir  pool_threads  reply_cache_stats        threads            unlock_ip
>   exports          max_block_size  nfsv4leasetime  pool_stats        portlist      supported_krb5_enctypes  unlock_filesystem  versions
> 
>   # ls /proc/sys/fs/binfmt_misc/
>   qemu-alpha  qemu-cris        qemu-microblazeel  qemu-mips64el  qemu-ppc64       qemu-sh4    qemu-sparc32plus  status
>   qemu-arm    qemu-m68k        qemu-mips          qemu-mipsel    qemu-ppc64abi32  qemu-sh4eb  qemu-sparc64
>   qemu-armeb  qemu-microblaze  qemu-mips64        qemu-ppc       qemu-s390x       qemu-sparc  register
> 
> 
> Only if I umount both of the /proc/sys/fs/binfmt_misc/ entries
> am I able to get past this EPERM error code.
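
For anyone who wants to check whether their host is in the same situation,
here is a minimal sketch that lists the mounts sitting below /proc, given
the text of /proc/mounts. The helper name mounts_below is made up for
illustration. It only reports what is mounted there; it does not reproduce
whatever test the kernel applies before allowing the proc mount.

```python
def mounts_below(mounts_text, root="/proc"):
    """Return (source, mount_point, fstype) for each mount strictly below root.

    mounts_text is expected in the /proc/mounts layout:
    whitespace-separated fields, starting with source, mount point, fstype.
    """
    # Append "/" so that root itself and siblings such as /procfoo are skipped.
    prefix = root.rstrip("/") + "/"
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or malformed line
        source, mount_point, fstype = fields[:3]
        if mount_point.startswith(prefix):
            found.append((source, mount_point, fstype))
    return found
```

Fed the /proc/mounts contents quoted above, this would report the autofs and
binfmt_misc mounts on /proc/sys/fs/binfmt_misc and the nfsd mount on
/proc/fs/nfsd, but not the proc mount on /proc itself.
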
> 
> Looking at GIT history I see this change as a likely candidate for
> something which has changed in this area:
> 
>   commit e51db73532955dc5eaba4235e62b74b460709d5b
>   Author: Eric W. Biederman <ebiederm at xmission.com>
>   Date:   Sat Mar 30 19:57:41 2013 -0700
> 
>     userns: Better restrictions on when proc and sysfs can be mounted
>     
>     Rely on the fact that another flavor of the filesystem is already
>     mounted and do not rely on state in the user namespace.
>     
>     Verify that the mounted filesystem is not covered in any significant
>     way.  I would love to verify that the previously mounted filesystem
>     has no mounts on top but there are at least the directories
>     /proc/sys/fs/binfmt_misc and /sys/fs/cgroup/ that exist explicitly
>     for other filesystems to mount on top of.
>     
>     Refactor the test into a function named fs_fully_visible and call that
>     function from the mount routines of proc and sysfs.  This makes this
>     test local to the filesystems involved and the results current of when
>     the mounts take place, removing a weird threading of the user
>     namespace, the mount namespace and the filesystems themselves.
>     
>     Signed-off-by: "Eric W. Biederman" <ebiederm at xmission.com>
> 
> 
> My guess is fs_fully_visible() is returning false, and thus causing the
> proc_mount() call to return EPERM, but I'm unclear why this would happen,
> or if this is indeed a correct hypothesis.
> 
> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

