plan9 semantics on Linux - mount namespaces

Fri Feb 16 18:26:59 UTC 2018

Enrico Weigelt <lkml at metux.net> writes:

> On 13.02.2018 22:12, Enrico Weigelt wrote:
>
> CC @containers at lists.linux-foundation.org
>
>> Hi folks,
>>
>>
>> I'm currently trying to implement plan9 semantics on Linux and
>> yet sorting out how to do the mount namespace handling.
>>
>> On plan9, any unprivileged process can create its own namespace
>> and mount/bind at will, while on Linux this requires CAP_SYS_ADMIN.
>>
>> What is the reason for not allowing arbitrary users to create their
>> own private mount namespace ? What could go wrong here ?

Suid root executables could be fooled.  An easy case is fooling
/bin/su into reading a different copy of /etc/shadow, which would
allow arbitrary changes between users.
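
To make the failure mode concrete, here is a toy model of it, not real
kernel behaviour.  Everything in it is a hypothetical stand-in: a
"namespace" is just a dict from path to contents, private_namespace()
plays the role of an unprivileged bind mount, and su() plays the role of
a setuid-root binary that trusts whatever /etc/shadow it sees.

```python
# Toy model only: a "namespace" maps paths to file contents.
HOST_NS = {"/etc/shadow": {"root": "secret-root-password"}}


def private_namespace(parent, binds):
    """Copy the parent's view and overlay the caller's own bind mounts."""
    ns = dict(parent)
    ns.update(binds)
    return ns


def su(namespace, target_user, password):
    """Stand-in for a setuid-root su: it trusts the /etc/shadow it can see."""
    shadow = namespace["/etc/shadow"]
    return shadow.get(target_user) == password


# The attacker binds a shadow file of their own choosing over the real one.
attacker_ns = private_namespace(HOST_NS, {"/etc/shadow": {"root": "guess"}})

print(su(HOST_NS, "root", "guess"))      # False: checked against the real file
print(su(attacker_ns, "root", "guess"))  # True: checked against the fake file
```

The privileged program itself is unchanged in both calls; only the view
of the filesystem it was started in differs.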

>> IMHO, we could allow mount/bind under the following conditions:
>>
>> * the process is in a private mount namespace
>> * no suid-flag is honored (either force all mounts to nosuid or
>>    completely mask it out)
>> * only certain whitelisted filesystems allowed (eg. 9P and FUSE)
>>
>> Maybe that all could be enabled by a new capability.
>>
>>
>> any suggestions ?

User namespaces limit the contained processes to not having any
permissions outside of the user namespace, while still allowing the
full unix permission model inside user namespaces.
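
As a minimal sketch of what that looks like from userspace: an
unprivileged process can ask for a new user namespace and a new mount
namespace in one unshare(2) call.  The helper name below is made up for
illustration, the flag values are the ones from <sched.h>, and whether
the call succeeds depends on the kernel and distribution configuration
(some systems disable unprivileged user namespaces), so the sketch only
reports success or failure.

```python
import ctypes
import os

CLONE_NEWNS = 0x00020000    # new mount namespace (value from <sched.h>)
CLONE_NEWUSER = 0x10000000  # new user namespace (value from <sched.h>)


def try_private_mount_namespace():
    """Fork a child that unshares user + mount namespaces.

    Returns True if the child's unshare() call succeeded, False otherwise.
    The parent's own namespaces are left untouched.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    pid = os.fork()
    if pid == 0:
        # Child: requesting CLONE_NEWUSER together with CLONE_NEWNS means
        # the mount namespace is owned by the new user namespace.
        code = 1
        try:
            if libc.unshare(CLONE_NEWUSER | CLONE_NEWNS) == 0:
                code = 0
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print("private mount namespace available:", try_private_mount_namespace())
```

Inside such a namespace the process has capabilities only relative to
its own user namespace, which is what keeps the suid problem above from
applying.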

I am in the final stages of getting the changes in the vfs and in fuse
to allow unprivileged users to mount that filesystem.  plan9fs would
also be a candidate for that kind of treatment if it had a maintainer.

Eric