[PATCH 0/5 RFC] Add an interface to discover relationships between namespaces

Michael Kerrisk (man-pages) mtk.manpages at gmail.com
Mon Jul 25 11:47:51 UTC 2016


Hi Andrey,

On 07/22/2016 08:25 PM, Andrey Vagin wrote:
> On Thu, Jul 21, 2016 at 11:48 PM, Michael Kerrisk (man-pages)
> <mtk.manpages at gmail.com> wrote:
>> Hi Andrey,
>>
>>
>> On 07/21/2016 11:06 PM, Andrew Vagin wrote:
>>>
>>> On Thu, Jul 21, 2016 at 04:41:12PM +0200, Michael Kerrisk (man-pages)
>>> wrote:
>>>>
>>>> Hi Andrey,
>>>>
>>>> On 07/14/2016 08:20 PM, Andrey Vagin wrote:
>>>
>>>
>>> <snip>
>>>
>>>>
>>>> Could you add here an explanation of the API in detail: what do these FDs
>>>> refer to, and how do you use them to solve the use case? And could you add
>>>> that info to the commit messages please.
>>>
>>>
>>> Hi Michael,
>>>
>>> A patch for man-pages is attached. It adds the following text to
>>> namespaces(7).
>>>
>>> Since  Linux 4.X, the following ioctl(2) calls are supported for names‐
>>> pace file descriptors.  The correct syntax is:
>>>
>>>       fd = ioctl(ns_fd, ioctl_type);
>>>
>>> where ioctl_type is one of the following:
>>>
>>> NS_GET_USERNS
>>>       Returns a file descriptor that refers to an owning  user  names‐
>>>       pace.
>>>
>>> NS_GET_PARENT
>>>       Returns  a  file  descriptor  that refers to a parent namespace.
>>>       This ioctl(2) can be used for pid and user namespaces. For  user
>>>       namespaces,  NS_GET_PARENT and NS_GET_USERNS have the same mean‐
>>>       ing.

For each of the above, I think it is worth mentioning that the
close-on-exec flag is set for the returned file descriptor.

>>>
>>> In addition to generic ioctl(2) errors, the following specific ones can
>>> occur:
>>>
>>> EINVAL NS_GET_PARENT was called for a nonhierarchical namespace.
>>>
>>> EPERM  The  requested  namespace  is  outside  of the current namespace
>>>       scope.

Perhaps add "and the caller does not have CAP_SYS_ADMIN in the initial
user namespace"?

>>>
>>> ENOENT ns_fd refers to the init namespace.
>>
>>
>> Thanks for this. But still part of the question remains unanswered.
>> How do we (in user-space) use the file descriptors to answer any of
>> the questions that this patch series was designed to solve? (This
>> info should be in the commit message and the man-pages patch.)
>
> I'm sorry, but I am not sure that I understand what you are asking.
>
> Here are the original questions:
> Someone else then asked me a question that led me to wonder about
> generally introspecting on the parental relationships between user
> namespaces and the association of other namespaces types with user
> namespaces. One use would be visualization, in order to understand the
> running system. Another would be to answer the question I already
> mentioned: what capability does process X have to perform operations
> on a resource governed by namespace Y?
>
> Here is an example which shows how we can get the owning namespace
> inode number by using these ioctl-s.
>
> $ ls -l /proc/13929/ns/pid
> lrwxrwxrwx 1 root root 0 Jul 22 21:03 /proc/13929/ns/pid -> 'pid:[4026532228]'
>
> $ ./nsowner /proc/13929/ns/pid
> user:[4026532227]
>
> The owning user namespace for pid:[4026532228] is user:[4026532227].
>
> The nsowner tool is compiled from this code:
>
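> #include <fcntl.h>        /* open() */
> #include <stdio.h>        /* snprintf(), printf() */
> #include <unistd.h>       /* readlink() */
> #include <sys/ioctl.h>    /* ioctl() */
> #include <linux/nsfs.h>   /* NS_GET_USERNS (assumed header location) */
>
> int main(int argc, char *argv[])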
> {
>         char buf[128], path[] = "/proc/self/fd/0123456789";
>         int ns, uns, ret;
>
>         ns = open(argv[1], O_RDONLY);
>         if (ns < 0)
>                 return 1;
>
>         uns = ioctl(ns, NS_GET_USERNS);
>         if (uns < 0)
>                 return 1;
>
>         snprintf(path, sizeof(path), "/proc/self/fd/%d", uns);
>         ret = readlink(path, buf, sizeof(buf) - 1);
>         if (ret < 0)
>                 return 1;
>         buf[ret] = 0;
>
>         printf("%s\n", buf);
>
>         return 0;
> }

So, from my point of view, the important piece that was missing from
your commit message was the note to use readlink("/proc/self/fd/%d")
on the returned FDs. I think that detail needs to be part of the
commit message (and also the man page text). I think it would even be
helpful to include the above program as part of the commit message:
it helps people more quickly grasp the API.
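
To make the readlink() step reusable, it could be wrapped along the
following lines. This is only a sketch: ns_fd_name() and ns_parse_name()
are hypothetical names, and the "type:[inode]" format is taken from the
output shown above.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: read the "type:[inode]" name of a namespace file
 * descriptor through its /proc/self/fd symlink.
 * Returns 0 on success, -1 on error.
 */
static int ns_fd_name(int fd, char *buf, size_t len)
{
        char path[64];
        ssize_t n;

        if (len == 0)
                return -1;
        snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
        n = readlink(path, buf, len - 1);
        if (n < 0)
                return -1;
        buf[n] = '\0';
        return 0;
}

/*
 * Hypothetical helper: split a name such as "user:[4026532227]" into
 * its type ("user") and inode number (4026532227).
 * Returns 0 on success, -1 if the string does not have that shape.
 */
static int ns_parse_name(const char *name, char *type, size_t type_len,
                         unsigned long long *ino)
{
        const char *colon = strchr(name, ':');
        char closing;
        size_t n;

        if (!colon)
                return -1;
        n = (size_t)(colon - name);
        if (n == 0 || n >= type_len)
                return -1;
        if (sscanf(colon, ":[%llu%c", ino, &closing) != 2 || closing != ']')
                return -1;
        memcpy(type, name, n);
        type[n] = '\0';
        return 0;
}
```

With these, two descriptors can be compared by type and inode number
rather than by comparing the raw link strings.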

> Does this example answer the original question?

Yes.

> If it doesn't, could
> you elaborate on what you expect to see here?
>
> And I wrote one more example, which shows all relationships between
> namespaces. It enumerates all processes in a system, collects all
> namespaces and determines the parent and owning namespaces for each of
> them, then it constructs a namespace tree and shows it.
>
> Here is the code: https://gist.github.com/avagin/db805f95e15ffb0af7e559dbb8de4418

That's great! Thanks!
  
> Here is an example of output for my test system:
> [root at fc24 nsfs]# ./nstree
> user:[4026531837]
>  \__  mnt:[4026532203]
>  \__  ipc:[4026531839]
>  \__  user:[4026532224]
>      \__  user:[4026532226]
>          \__  user:[4026532227]
>              \__  pid:[4026532228]
>      \__  pid:[4026532225]
>          \__  pid:[4026532228]
>  \__  user:[4026532221]
>      \__  pid:[4026532222]
>      \__  user:[4026532223]
>  \__  mnt:[4026532211]
>  \__  uts:[4026531838]
>  \__  cgroup:[4026531835]
>  \__  pid:[4026531836]
>      \__  pid:[4026532225]
>          \__  pid:[4026532228]
>      \__  pid:[4026532222]
>  \__  mnt:[4026531857]
>  \__  mnt:[4026531840]
>  \__  net:[4026531957]

Cheers,

Michael

>>>>> [1] https://lkml.org/lkml/2016/7/6/158
>>>>> [2] https://lkml.org/lkml/2016/7/9/101

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

