[RFC][PATCH] ns: Syscalls for better namespace sharing control.

Matt Helsley matthltc at us.ibm.com
Thu Feb 25 17:09:15 PST 2010


On Thu, Feb 25, 2010 at 12:57:02PM -0800, Eric W. Biederman wrote:
> 
> Introduce two new system calls:
> int nsfd(pid_t pid, unsigned long nstype);
> int setns(unsigned long nstype, int fd);
> 
> These two new system calls address three specific problems that can
> make namespaces hard to work with.
> - Namespaces require a dedicated process to pin them in memory.
> - It is not possible to use a namespace unless you are the
>   child of the original creator.
> - Namespaces don't have names that userspace can use to talk
>   about them.
> 
> The nsfd() system call returns a file descriptor that can
> be used to talk about a specific namespace, and to keep
> the specified namespace alive.
> 
> The fd returned by nsfd() can be bind mounted as:
> mount --bind /proc/self/fd/N /some/filesystem/path
> to keep the namespace alive indefinitely as long as
> it is mounted.
> 
> open works on the fd returned by nsfd() so another
> process can get a hold of it and do interesting things.
> 
> Overall that allows for persistent naming of namespaces
> according to userspace policy.
> 
> setns() allows changing the namespace of the current process
> to a namespace that originates with nsfd().
> 
> Signed-off-by: Eric W. Biederman <ebiederm at xmission.com>
> ---
> 
> This is just my first pass at this, and not yet compile tested.
> I was pleasantly surprised at how easy all of this was to implement.

<snip>

> +SYSCALL_DEFINE2(setns, unsigned long, nstype, int, fd)
> +{
> +	struct file *file;
> +
> +	if (!capable(CAP_SYS_ADMIN))
> +		return -EPERM;

Is this check preliminary? In the future, would we check against the
owner of the target namespace too? Naturally that will require tagging
each namespace with an owner, but I thought that was already part of the
plan...

Cheers,
	-Matt Helsley

