[RFC][v8][PATCH 0/10] Implement clone3() system call

Eric W. Biederman ebiederm at xmission.com
Tue Oct 20 03:46:06 PDT 2009


Sukadev Bhattiprolu <sukadev at linux.vnet.ibm.com> writes:

> Eric W. Biederman [ebiederm at xmission.com] wrote:
> | > clone3() seemed to be the leading contender from what I've read so far.
> | > Does anyone still object to clone3() after reading the whole thread?
> | 
> | I object to what clone3() is.  The name is not particularly interesting.
> | 
> | The sanity checks for assigning pids are missing and there is a todo
> | about it.  I am not comfortable with assigning pids to a new process
> | in a pid namespace with other user space processes executing
> | in it.
>
> Could you clarify? How is the call to alloc_pidmap() from clone3() different
> from the call from clone() itself ?

I think it is totally inappropriate to assign pids in a pid namespace
where there are user space processes already running.

> | How we handle a clone extension depends critically on whether we want
> | to create processes for restart in user space or kernel space.
> | 
> | Could someone give me or point me at a strong case for creating the
> | processes for restart in user space?
>
> There has been a lot of discussion on this with reference to the
> Checkpoint/Restart patchset. See http://lkml.org/lkml/2009/4/13/401
> for instance.

Just read it.  Thank you.  Now I am certain clone_with_pids() is
not useful functionality to be exporting to userspace.

The only real argument in favor of doing this in user space is greater
flexibility.  I can see checkpointing/restoring a single-threaded process
without a pid namespace.  Anything more and you are just asking for
trouble.

A design that weakens security and increases maintenance costs, all for
an unreliable result, seems like a bad one to me.

> | The pid assignment code is currently ugly.  I asked that we just pass
> | the min and max pids that already exist into the core pid
> | assignment function, and pass a constrained min/max that only admits
> | a single pid when we are allocating a struct pid for restart.  That was
> | not done and now we have a weird abortion with unnecessary special cases.
>
> I did post a version of the patch attempting to implement that. As
> pointed out in:
>
> 	http://lkml.org/lkml/2009/8/17/445
>
> we would need more checks in alloc_pidmap() to cover cases like min or max
> being invalid or min being greater than max or max being greater than pid_max
> etc. Those checks also made the code ugly (imo).

If you need more checks you are doing it wrong.  The code already has min
and max values, and even a start value.  I was just strongly suggesting
we generalize where we get the values from, and then we have no special
cases.
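
To make the shape of that suggestion concrete, here is a minimal user-space
sketch.  It is not kernel code and not the posted patch: every toy_* name,
the flat bool array, and the tiny pid limit are assumptions made up for
illustration.  The core function takes start, min and max from its caller;
normal allocation passes the full range, and restart passes a range that
admits exactly one pid.  The assert() only documents what callers are
expected to guarantee.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PID_MIN 1    /* hypothetical lowest allocatable pid */
#define TOY_PID_MAX 64   /* hypothetical exclusive upper bound, kept tiny */

static bool toy_pid_used[TOY_PID_MAX];  /* stand-in for the real pid bitmap */
static int toy_last_pid;                /* last pid handed out sequentially */

/*
 * Core allocator: scan [min, max) once, beginning at 'start' and wrapping
 * back to 'min'.  Returns the allocated pid, or -1 if the range is full.
 * It has no idea whether it is serving a normal fork or a restart.
 */
static int toy_alloc_pid_range(int start, int min, int max)
{
	int span = max - min;

	/* Callers pick the range; this states the contract they must meet. */
	assert(min < max && start >= min && start < max);

	for (int i = 0; i < span; i++) {
		int pid = min + (start - min + i) % span;

		if (!toy_pid_used[pid]) {
			toy_pid_used[pid] = true;
			return pid;
		}
	}
	return -1;
}

/* Normal allocation: whole range, starting just after the last pid. */
static int toy_alloc_pid(void)
{
	int start = toy_last_pid + 1;
	int pid;

	if (start >= TOY_PID_MAX)
		start = TOY_PID_MIN;

	pid = toy_alloc_pid_range(start, TOY_PID_MIN, TOY_PID_MAX);
	if (pid >= 0)
		toy_last_pid = pid;
	return pid;
}

/*
 * Restart-style allocation: a range that only admits 'target'.  The one
 * bounds check sits here, where the outside value enters (a choice made
 * for this sketch).
 */
static int toy_alloc_pid_at(int target)
{
	if (target < TOY_PID_MIN || target >= TOY_PID_MAX)
		return -1;
	return toy_alloc_pid_range(target, target, target + 1);
}
```

With this layout toy_alloc_pid_at(10) either returns 10 or fails with -1
because 10 is already taken, and both paths go through the same scan loop.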

Eric


