[RFC][PATCH] ns: Syscalls for better namespace sharing control.

Daniel Lezcano daniel.lezcano at free.fr
Mon Mar 8 13:12:50 PST 2010


Eric W. Biederman wrote:
> Daniel Lezcano <daniel.lezcano at free.fr> writes:
>
>   
>> Eric W. Biederman wrote:
>>     
>>> Daniel Lezcano <daniel.lezcano at free.fr> writes:
>>>
>>>   
>>>       
>>>> Eric W. Biederman wrote:
>>>>     
>>>>         
>>>>> Daniel Lezcano <daniel.lezcano at free.fr> writes:
>>>>>
>>>>>         
>>>>>           
>>>>>> Eric W. Biederman wrote:
>>>>>>             
>>>>>>             
>>>>>>> I have taken a snapshot of my development tree and placed it at:
>>>>>>>
>>>>>>>
>>>>>>> git://git.kernel.org/pub/scm/linux/people/ebiederm/linux-2.6.33-nsfd-v5.git
>>>>>>>                   
>>>>>>>               
>>>>>> Hi Eric,
>>>>>>
>>>>>> thanks for the pointer.
>>>>>>
>>>>>> I tried to boot the kernel under qemu and I got this oops:
>>>>>>             
>>>>>>             
>>>>> I am clearly running an old userspace on my test machine.  No udev.
>>>>> It looks like udev has a long-standing netlink misfeature, where
>>>>> it does not initialize NETLINK_CB....
>>>>>
>>>>>
>>>>> From 8d85e3ab88718eda3d94cf8e1be14b69dae2b8f1 Mon Sep 17 00:00:00 2001
>>>>> From: Eric W. Biederman <ebiederm at xmission.com>
>>>>> Date: Mon, 8 Mar 2010 09:25:20 -0800
>>>>> Subject: [PATCH] kobject_uevent:  Use the netlink allocator helper...
>>>>>
>>>>> Signed-off-by: Eric W. Biederman <ebiederm at xmission.com>
>>>>>         
>>>>>           
>>>> Thanks.
>>>>
>>>> I was able to boot but I have the following warning:
>>>>     
>>>>         
>>> Thanks for the bug report.
>>>   
>>>       
>> Thanks to you for the patchset :)
>>
>>     
>>> For the moment you might want to drop:
>>> af_netlink:  Allow credentials to work across namespaces.
>>> af_netlink: Debugging in case I have missed something.
>>>
>>> Although I am curious if you hit my debugging messages in
>>> netlink recv.
>>>   
>>>       
>> No, it does not appear (looked for "missing NETLINK_CB proto").
>>
>>     
>>> I guess if the goal is to test my nsfd bits you can drop everything
>>> starting with my 'scm: Reorder scm_cookie.' commit.  The rest is what
>>> it takes to get uids, gids and pids translated when they cross
>>> namespaces on an af_unix or an af_netlink socket.
>>>
>>> At least in the af_netlink case it appears clear I have missed
>>> something.
>>>
>>> This is a warning that netlink throws when the packet accounting is
>>> messed up.  So it sounds like you are exercising another path that I
>>> failed to exercise and fix.
>>>   
>>>       
>> I will keep looking to see if I can find more clues about this warning.
>>
>> In the meantime I was able to enter the container with the following
>> ugly program:
>>
>> #include <unistd.h>
>> #include <stdlib.h>
>> #include <stdio.h>
>> #include <syscall.h>
>> #include <sys/types.h>
>> #include <sys/stat.h>
>> #include <fcntl.h>
>> #include <sys/param.h>
>>
>> #define __NR_setns 300
>>
>> int setns(int nstype, int fd)
>> {
>>    return syscall (__NR_setns, nstype, fd);
>> }
>>
>> int main(int argc, char *argv[])
>> {
>>    char path[MAXPATHLEN];
>>    char *ns[] = { "ipc", "mnt", "net", "pid", "uts" };
>>    const int size = sizeof(ns) / sizeof(char *);
>>    int fd[size];
>>    int i;
>>
>>    if (argc != 3) {
>>        fprintf(stderr, "mynsenter <pid> <command>\n");
>>        exit(1);
>>    }
>>
>>    /* Open every namespace file first, before any setns() call. */
>>    for (i = 0; i < size; i++) {
>>        snprintf(path, sizeof(path), "/proc/%s/ns/%s", argv[1], ns[i]);
>>
>>        fd[i] = open(path, O_RDONLY);
>>        if (fd[i] < 0) {
>>            perror("open");
>>            return -1;
>>        }
>>    }
>>
>>    /* Join each namespace in turn. */
>>    for (i = 0; i < size; i++) {
>>        if (setns(0, fd[i])) {
>>            perror("setns");
>>            return -1;
>>        }
>>    }
>>
>>    execve(argv[2], &argv[2], NULL);
>>
>>    /* Only reached if execve() failed. */
>>    perror("execve");
>>    return 1;
>> }
>>
>> At first glance, no problem :)
>>     
>
> No fork(), so your process is completely in the pid namespace?
>   
What I do is attach "/bin/sh" to the container with this program.
The container is a VPS running busybox with full isolation.

echo $$ gives the real pid.
All the forked processes appear in the pid namespace; they are visible
through /proc with their virtual pids.
I am not able to change to the /proc/self directory (I assume this is
normal).
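
To make the fork() point concrete, here is a minimal sketch, assuming one
wants the command itself to be a process created after the setns() calls
rather than the caller exec'ing in place. run_in_child() is a hypothetical
helper, not part of the program above or of the patchset; it only uses
fork(), execv() and waitpid(), and it does not call setns() itself.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork, exec argv[0] in the child and wait for it.
 * Returns the child's exit status, or -1 on error or abnormal termination.
 */
int run_in_child(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: created after any namespace switch done by the caller. */
        execv(argv[0], argv);
        perror("execv");
        _exit(127);
    }

    /* Parent: keeps its original pid and just collects the child. */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the program above this would stand in for the final execve(), for
example "return run_in_child(&argv[2]);". Going by the behaviour described
here, the command would then get a virtual pid like any other forked
process.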



