[v11][PATCH 9/9] Document clone_with_pids() syscall

Matt Helsley matthltc at us.ibm.com
Fri Nov 6 13:45:29 PST 2009


On Fri, Nov 06, 2009 at 12:18:14PM -0800, Matt Helsley wrote:
> On Fri, Nov 06, 2009 at 12:39:36PM -0600, Serge E. Hallyn wrote:
> > Quoting Sukadev Bhattiprolu (sukadev at us.ibm.com):
> > ...
> > > +	If a pid in the @pids list is non-zero, the kernel tries to assign
> > > +	the specified pid in that namespace.  If that pid is already in use
> > > +	by another process, the system call fails (see EBUSY below).
> > > +
> > > +	The order of pids in @pids is oldest in pids[0] to youngest pid
> > > +	namespace in pids[nr_pids-1]. If the number of pids specified in the
> > 
> > In the sys_choosepid() discussion, Matt suggested it would be more
> > user-friendly to have the pid for the youngest pidns be pids[0].
> > That way the user doesn't have to know their pidns depth.
> 
> As far as I could see, Suka's solution also does not require knowing
> the pidns depth (aka level). He made it so that copy_from_user()
> adjusts its destination using the discrepancy between the number of
> pids passed and the number of levels.
> 
> If userspace passes an array with n pids and there are k namespace levels
> then clone_with_pids() makes sure that the kernel sees a pid array like:
> 
> index	  0     ... k - (n + 1)        ...          k - 1
> 	+-----------------------+-------------------------+
> pid_t	| 0 ..................0 | <copied from userspace> |
> 	+-----------------------+-------------------------+

(The diagram assumes n < k. If n == k there is no zero padding, and
pids[0] is the pid desired in the initial namespace.)

> 
> So even though the order is different from choosepid() the calling
> task still doesn't need to know its pidns level. Of course, just
> like choosepid(), n <= k or userspace will get EINVAL.
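
To make the layout in the diagram concrete, here is a minimal userspace
sketch of the padding step. It is only a model of the behaviour described
above, not the code from the patch: pad_target_pids() and its parameters
are hypothetical names, plain memcpy() stands in for copy_from_user(), and
I'm assuming a zero entry means "no specific pid requested at that level".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical model of the layout in the diagram (not the patch itself).
 * user_pids holds the n pids passed by the caller; out has room for k
 * entries, one per pid namespace level, oldest level at index 0.
 * Returns 0 on success, or -EINVAL if more pids than levels were passed.
 */
static int pad_target_pids(const pid_t *user_pids, size_t n,
			   pid_t *out, size_t k)
{
	assert(out != NULL && (user_pids != NULL || n == 0));

	if (n > k)
		return -EINVAL;

	/* Levels 0 .. k - (n + 1): nothing requested, so leave them zero. */
	memset(out, 0, (k - n) * sizeof(*out));

	/* Levels k - n .. k - 1: the pids copied from the caller. */
	if (n > 0)
		memcpy(out + (k - n), user_pids, n * sizeof(*user_pids));

	return 0;
}
```

With k = 4 levels and a caller passing { 77, 1 }, out ends up as
{ 0, 0, 77, 1 }, so the caller never has to know k.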

I forgot to mention that I prefer the way choosepid() orders the pids.
That ordering is not inspired by the way the kernel implements pid
namespaces; it has more to do with the way userspace sees things (IMHO).
I don't know whether it makes more sense to change clone_with_pids() or
to have [e]glibc wrappers swap the array contents.
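
If the wrapper route were taken, the swap itself would be small. A rough
sketch, assuming the wrapper accepts pids in choosepid() order (youngest
pidns first) and reverses them in place before making the syscall;
reverse_pids() is a hypothetical helper name, not an existing [e]glibc
function:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper: reverse an array of n pids in place, turning
 * youngest-first order into oldest-first order (or the other way round).
 */
static void reverse_pids(pid_t *pids, size_t n)
{
	assert(pids != NULL || n == 0);

	if (n < 2)
		return;

	/* Swap entries from both ends, moving toward the middle. */
	for (size_t lo = 0, hi = n - 1; lo < hi; lo++, hi--) {
		pid_t tmp = pids[lo];

		pids[lo] = pids[hi];
		pids[hi] = tmp;
	}
}
```

For example, { 1, 42, 300 } given youngest-first becomes { 300, 42, 1 },
which is the oldest-first order the current documentation describes.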

Cheers,
	-Matt Helsley

