For review: user_namespace(7) man page

Eric W. Biederman ebiederm at xmission.com
Tue Sep 9 15:49:34 UTC 2014


"Michael Kerrisk (man-pages)" <mtk.manpages at gmail.com> writes:

> Hi Eric,
>
> On 08/30/2014 02:53 PM, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)" <mtk.manpages at gmail.com> writes:
>> 
>>> Hello Eric et al.,
>>>
>>> For various reasons, my work on the namespaces man pages 
>>> fell off the table a while back. Nevertheless, the pages have
>>> been close to completion for a while now, and I recently restarted,
>>> in an effort to finish them. As you also noted to me f2f, there have
>>> recently been some small namespace changes that may affect
>>> the content of the pages. Therefore, I'll take the opportunity to
>>> send the namespace-related pages out for further (final?) review.
>>>
>>> So, here, I start with the user_namespaces(7) page, which is shown 
>>> in rendered form below, with source attached to this mail. I'll
>>> send various other pages in follow-on mails.
>>>
>>> Review comments/suggestions for improvements / bug fixes welcome.
>>>
>>> Cheers,
>>>
>>> Michael
>>>
>>> ==
>>>
>>> NAME
>>>        user_namespaces - overview of Linux user_namespaces
>>>
> [...]
>
>>>        When a new IPC, mount, network, PID, or UTS namespace is  created
>>>        via clone(2) or unshare(2), the kernel records the user namespace
>>>        of the creating process against the new namespace.  (This associ‐
>>>        ation  can't  be  changed.)   When a process in the new namespace
>>>        subsequently  performs  privileged  operations  that  operate  on
>>>        global resources isolated by the namespace, the permission checks
>>>        are performed according to the process's capabilities in the user
>>>        namespace that the kernel associated with the new namespace.
>> 
>> Restrictions on mount namespaces.
>> 
>> - A mount namespace has an owner user namespace.  A mount namespace whose
>>   owner user namespace is different than the owner user namespace of
>>   its parent mount namespace is considered a less privileged mount
>>   namespace.
>> 
>> - When creating a less privileged mount namespace shared mounts are
>>   reduced to slave mounts.  This ensures that mappings performed in less
>>   privileged mount namespaces will not propagate to more privileged
>>   mount namespaces.
>> 
>> - Mounts that come as a single unit from more privileged mount
>>   namespaces are locked together and may not be separated in a less
>>   privileged mount namespace.
>
> Could you clarify what you mean by "Mounts that come as a single
> unit"?

unshare(CLONE_NEWNS) brings across all of the mounts from the original
mount namespace as a single unit.

Recursive mounts that propagate between mount namespaces propagate as a
single unit.

The importance of this is to allow the global root to mount over things
and not have to worry that someone from a user namespace root can
peek underneath.
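
To make the "single unit" point concrete, here is a toy Python model of
the rule. It is only a sketch, not kernel code: Mount,
copy_for_less_privileged_ns and try_umount are made-up names, and the
model assumes that every mount inherited by the less privileged
namespace comes across locked and that unmounting a locked mount is
refused with EINVAL, which is what umount(2) documents for a locked
target.

```python
import errno
from dataclasses import dataclass


@dataclass
class Mount:
    mountpoint: str
    locked: bool = False  # models the "locked" state of an inherited mount


def copy_for_less_privileged_ns(mounts):
    """Copy a mount list as one unit; every inherited mount comes across locked."""
    return [Mount(m.mountpoint, locked=True) for m in mounts]


def try_umount(mounts, mountpoint):
    """Return 0 and drop the mount, or return an errno and change nothing."""
    for m in mounts:
        if m.mountpoint == mountpoint:
            if m.locked:
                return errno.EINVAL  # refusing keeps what is underneath hidden
            mounts.remove(m)
            return 0
    return errno.EINVAL  # not a mount point in this namespace


# Hypothetical mount table of the more privileged namespace.
parent_ns = [Mount("/"), Mount("/proc"), Mount("/secret-overlay")]
child_ns = copy_for_less_privileged_ns(parent_ns)

# A mount created inside the child namespace is not part of the inherited unit.
child_ns.append(Mount("/mnt/scratch"))

print(try_umount(child_ns, "/secret-overlay") == errno.EINVAL)  # inherited mount
print(try_umount(child_ns, "/mnt/scratch") == 0)                # local mount
```

The model deliberately ignores mount stacking and propagation; it only
shows why the inherited mounts cannot be peeled apart one by one.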

>> - The mount flags readonly, nodev, nosuid, noexec, and the mount atime
>>   settings when propagated from a more privileged to a less privileged
>>   mount namespace become locked, and may not be changed in the less
>>   privileged mount namespace.
>> 
>> - (As of 3.18-rc1 (in today's Al Viro's vfs.git#for-next tree)) A file or
>>   directory that is a mountpoint in one namespace that is not a mount
>>   point in another namespace, may be renamed, unlinked, or rmdired in
>>   the mount namespace in which it is not a mount point if the
>>   ordinary permission checks pass.
>> 
>>   Previously attempting to rmdir, unlink or rename a file or directory
>>   that was a mount point in another mount namespace would result in
>>   -EBUSY.  This behavior had technical problems of enforcement (nfs)
>>   and resulted in a nice denial of service attack against more
>>   privileged users.  (Aka preventing individual files from being updated
>>   by bind mounting on top of them).
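
The shared-to-slave reduction in the list above can be checked from user
space by looking at the optional fields in /proc/self/mountinfo, which
proc(5) documents as tags such as shared:N and master:N placed before
the "-" separator. A minimal Python sketch follows; the function name
propagation and the two sample lines are made up for illustration (the
lines follow the proc(5) layout but do not come from a real system).

```python
def propagation(mountinfo_line):
    """Classify one /proc/[pid]/mountinfo line by its optional fields."""
    fields = mountinfo_line.split()
    sep = fields.index("-", 6)   # optional fields end at the "-" separator
    optional = fields[6:sep]     # zero or more tags such as "shared:1"
    tags = {f.split(":")[0] for f in optional}
    if "unbindable" in tags:
        return "unbindable"
    if "shared" in tags and "master" in tags:
        return "shared+slave"
    if "shared" in tags:
        return "shared"
    if "master" in tags:
        return "slave"
    return "private"


# Made-up sample: the same mount as seen in the original namespace and
# in a less privileged copy of it.
before = "36 35 98:0 / /mnt rw,noatime shared:1 - ext3 /dev/root rw"
after = "72 71 98:0 / /mnt rw,noatime master:1 - ext3 /dev/root rw"

print(propagation(before))
print(propagation(after))
```

Running the function over the real mountinfo file before and after
creating the less privileged namespace should show the shared entries
turning into slave entries of the same peer group.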
>
> I have reworked the text above a little so that now we have the following.
> Aside from question above, does it look okay?
>
>    Restrictions on mount namespaces
>        Note the following points with respect to mount namespaces:
>
>        *  A  mount  namespace  has  na  owner user namespace.  A mount
                                     ^s/na/an/
>           namespace whose owner user namespace is different  from  the
>           owner  user  namespace of its parent mount namespace is con‐
>           sidered a less privileged mount namespace.
>
>        *  When creating a  less  privileged  mount  namespace,  shared
>           mounts  are reduced to slave mounts.  This ensures that map‐
>           pings performed in less privileged mount namespaces will not
>           propagate to more privileged mount namespaces.
>
>        *  Mounts that come as a single unit from more privileged mount
            ^ namespaces
>           are locked together and may not be separated in a less priv‐
>           ileged mount namespace.
>
>        *  The  mount(2) flags MS_RDONLY, MS_NOSUID, MS_NOEXEC, and the
>           "atime" flags (MS_NOATIME, MS_NODIRATIME, MS_RELATIME)  set‐
>           tings  become  locked when propagated from a more privileged
>           to a less privileged mount namespace, and may not be changed
>           in the less privileged mount namespace.
>
>        *  A  file  or directory that is a mount point in one namespace
>           that is not a mount  point  in  another  namespace,  may  be
>           renamed, unlinked, or removed (rmdir(2)) in the mount names‐
>           pace in which it is not a mount point (subject to the  usual
>           permission checks).
>
>           Previously,  attempting  to unlink, rename, or remove a file
>           or directory that was a mount point in another mount  names‐
>           pace  would  result  in  the error EBUSY.  That behavior had
>           technical problems of enforcement (e.g., for NFS)  and  per‐
>           mitted  denial-of-service  attacks  against  more privileged
>           users.   (i.e.,  preventing  individual  files  from   being
>           updated by bind mounting on top of them).
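
The flag-locking point above can be sketched as a small check in
Python. This is a hedged model, not the kernel's code: remount_allowed
is a hypothetical name, the MS_* values are the ones from <sys/mount.h>,
and MS_NODEV is included because the earlier list in this thread names
nodev. The assumptions are that a restrictive flag set on the inherited
mount must stay set, and that the atime setting must be carried over
unchanged.

```python
# Values as defined in <sys/mount.h> on Linux.
MS_RDONLY = 1
MS_NOSUID = 2
MS_NODEV = 4
MS_NOEXEC = 8
MS_NOATIME = 1024
MS_NODIRATIME = 2048
MS_RELATIME = 1 << 21

ATIME_MASK = MS_NOATIME | MS_NODIRATIME | MS_RELATIME
LOCKABLE = MS_RDONLY | MS_NOSUID | MS_NODEV | MS_NOEXEC


def remount_allowed(inherited_flags, requested_flags):
    """Model: may a less privileged namespace remount with requested_flags?"""
    # A restrictive flag that was set on the inherited mount must stay set.
    cleared = inherited_flags & LOCKABLE & ~requested_flags
    if cleared:
        return False
    # The atime setting is locked as a whole, so it must not change.
    return (inherited_flags & ATIME_MASK) == (requested_flags & ATIME_MASK)


inherited = MS_RDONLY | MS_NOSUID | MS_RELATIME

print(remount_allowed(inherited, MS_NOSUID | MS_RELATIME))   # drops MS_RDONLY
print(remount_allowed(inherited, inherited | MS_NOEXEC))     # only adds a flag
print(remount_allowed(inherited, MS_RDONLY | MS_NOSUID | MS_NOATIME))  # new atime mode
```

Adding restrictions stays possible in this model; only relaxing an
inherited restriction or switching the atime mode is refused.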

Subject to tiny typo corrections, that looks fine.

Eric

