Documenting the ioctl interfaces to discover relationships between namespaces

Michael Kerrisk (man-pages) mtk.manpages at
Wed Dec 14 07:32:45 UTC 2016

On 12/12/2016 07:18 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages at> writes:
>> On 12/11/2016 11:30 PM, Eric W. Biederman wrote:
>>> "Michael Kerrisk (man-pages)" <mtk.manpages at> writes:
>>>> [was: [PATCH 0/4 v3] Add an interface to discover relationships
>>>> between namespaces]
>>> One small comment below.
>>>>    Introspecting namespace relationships
>>>>        Since Linux 4.9, two ioctl(2) operations  are  provided  to  allow
>>>>        introspection  of  namespace relationships (see user_namespaces(7)
>>>>        and pid_namespaces(7)).  The form of the calls is:
>>>>            ioctl(fd, request);
>>>>        In each case, fd refers to a /proc/[pid]/ns/* file.
>>>>        NS_GET_USERNS
>>>>               Returns a file descriptor that refers to  the  owning  user
>>>>               namespace for the namespace referred to by fd.
>>>>        NS_GET_PARENT
>>>>               Returns  a file descriptor that refers to the parent names‐
>>>>               pace of the namespace referred to by fd.  This operation is
>>>>               valid  only for hierarchical namespaces (i.e., PID and user
>>>>               namespaces).  For user namespaces, NS_GET_PARENT is synony‐
>>>>               mous with NS_GET_USERNS.
>>>>        In each case, the returned file descriptor is opened with O_RDONLY
>>>>        and O_CLOEXEC (close-on-exec).
>>>>        By applying fstat(2) to the returned file descriptor, one  obtains
>>>>        a  stat structure whose st_ino (inode number) field identifies the
>>>>        owning/parent namespace.  This inode number can  be  matched  with
>>>>        the  inode  number  of  another  /proc/[pid]/ns/{pid,user} file to
>>>>        determine whether that is the owning/parent namespace.
>>> Like all fstat inode comparisons, to be fully accurate you need to
>>> compare both st_ino and st_dev.  I reserve the right for st_dev to
>>> be significant when comparing namespaces.  Otherwise I might have to
>>> create a namespace of namespaces someday and that is ugly.
>>>>        Either of these ioctl(2) operations can fail  with  the  following
>>>>        error:
>>>>        EPERM  The  requested  namespace is outside of the caller's names‐
>>>>               pace scope.  This error can occur if, for example, the own‐
>>>>               ing  user  namespace is an ancestor of the caller's current
>>>>               user namespace.  It can also occur on  attempts  to  obtain
>>>>               the parent of the initial user or PID namespace.
>>>>        Additionally,  the  NS_GET_PARENT operation can fail with the fol‐
>>>>        lowing error:
>>>>        EINVAL fd refers to a nonhierarchical namespace.
>>>>        See the EXAMPLE section for an example of the use of these  opera‐
>>>>        tions.
>> So, after playing with this a bit, I have a question. 
>> I gather that in order to, for example, elaborate the tree of user
>> namespaces on the system, one would use NS_GET_PARENT on each of
>> the /proc/*/ns/user files and match up the results. Right?
>> What happens if one of the parent user namespaces contains no
>> processes? That is, the parent namespace exists by virtue of being
>> pinned because a /proc/PID/ns/user file is open or bind mounted.
>> (Chrome seems to do this sort of dance with user namespaces, for
>> example.) How do we find the ancestor of *that* user namespace?
> What is returned from NS_GET_USERNS and NS_GET_PARENT is a file
> descriptor, that you can call NS_GET_PARENT on.

Thanks, Eric. While trying to solve the small task I set myself,
and probably confused by past discussions[1], I was overlooking
the obvious.
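
For the archive, here is a minimal sketch of the point above: because
NS_GET_PARENT returns a descriptor that itself accepts NS_GET_PARENT, one
can climb from any namespace to its ancestors even when an ancestor
contains no processes. The helper names walk_ancestors() and
same_namespace() are hypothetical, and the fallback ioctl definitions
are an assumption for systems whose headers lack <linux/nsfs.h>. The
comparison checks both st_dev and st_ino, as Eric suggests.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Values as in <linux/nsfs.h> (Linux 4.9 and later); defined here as a
   fallback in case the installed kernel headers predate that file. */
#ifndef NS_GET_USERNS
#define NSIO          0xb7
#define NS_GET_USERNS _IO(NSIO, 0x1)
#define NS_GET_PARENT _IO(NSIO, 0x2)
#endif

/* Two stat results identify the same namespace only if both the
   device ID and the inode number match. */
bool same_namespace(const struct stat *a, const struct stat *b)
{
    assert(a != NULL && b != NULL);
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Hypothetical helper: 'fd' refers to a /proc/[pid]/ns/{pid,user} file.
   Apply NS_GET_PARENT to it, then to each returned descriptor in turn,
   printing the identity of every ancestor reached. Returns the number
   of ancestors visited, or -1 on an unexpected error. The caller's
   descriptor is left open. */
int walk_ancestors(int fd)
{
    int depth = 0;
    int cur = fd;

    for (;;) {
        int parent = ioctl(cur, NS_GET_PARENT);
        int saved_errno = errno;   /* close() below may change errno */

        if (cur != fd)
            close(cur);

        if (parent == -1) {
            /* EPERM: the parent is outside the caller's namespace
               scope, or 'cur' was the initial namespace. */
            if (saved_errno == EPERM)
                return depth;
            fprintf(stderr, "NS_GET_PARENT: %s\n", strerror(saved_errno));
            return -1;
        }

        struct stat sb;
        if (fstat(parent, &sb) == -1) {
            close(parent);
            return -1;
        }

        depth++;
        printf("ancestor %d: dev=%llu ino=%llu\n", depth,
               (unsigned long long) sb.st_dev,
               (unsigned long long) sb.st_ino);

        cur = parent;
    }
}
```

A caller would open, say, /proc/self/ns/user with O_RDONLY and pass the
descriptor to walk_ancestors(). To build the whole tree, the stat data
for each ancestor can be matched against the other namespaces seen so
far using same_namespace().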




Michael Kerrisk
Linux man-pages maintainer;
Linux/UNIX System Programming Training:
