[RFC][v7][PATCH 0/9] Implement clone2() system call
orenl at librato.com
Wed Sep 30 10:41:45 PDT 2009
Alexey Dobriyan wrote:
> On Thu, Sep 24, 2009 at 01:35:56PM -0500, Serge E. Hallyn wrote:
>> Quoting Alexey Dobriyan (adobriyan at gmail.com):
>>> I don't like this even more.
>>> Pid namespaces are hierarchical _and_ anonymous, so simply
>>> a set of numbers doesn't describe the final object.
>>> struct pid isn't special, it's just another invariant if you like
>>> as far as C/R is concerned, but system call is made special wrt pids.
>>> What will be in an image? I hope "struct kstate_image_pid" with several
>> Sure pid namespaces are anonymous, but we will give each an 'objref'
>> valid only for a checkpoint image, and store the relationship between
>> pid namespaces based on those objrefs. Basically the same way that user
>> structs and hierarchical user namespaces are handled right now.
> OK, that's certainly doable.
> You're committing yourself to creation of tasks in userspace if this goes in. :-\
> Which can lead you into putting the wrong kind of relations into the image.
A malicious user can put the "wrong" kind of relations into the image,
regardless of whether the tasks are created in the kernel or in
userspace. As long as the creation follows the "instructions" in
the image, the result would be the same.
> IIRC, clone_flags were in image (still?), but tomorrow kernel will get
> new way to acquire, say, uts_ns, which, in theory, can't be described by
> a set of consecutive clones, so, you'll have to fixup something in kernel.
The only things enforced in user space are task relations, threads,
and (as a by-product) session IDs. The rest are refined in the
kernel. This includes uts_ns, for example.
(FWIW, _any_ clone relationships can be described by a set of
clones. In particular because that's how they were constructed
in the first place).
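
To make the parenthetical above concrete, here is a minimal sketch (not the
actual restart code) of turning the parent relation recorded in an image into
an ordered list of clone steps. The function name plan_clones, the integer
task ids, and the single-root layout are all assumptions made for
illustration. It also shows that a "wrong" relation, such as a cycle, is
rejected the same way wherever the replay happens.

```python
from collections import defaultdict


def plan_clones(parents, root):
    """Return (parent, child) clone steps in an order that can be replayed.

    `parents` maps each hypothetical image task id to its parent's id;
    the root task maps to None. Both the ids and the layout are assumed.
    """
    if parents.get(root, 0) is not None:
        raise ValueError("root must be present and have no parent")

    # Invert the child -> parent map into parent -> [children].
    children = defaultdict(list)
    for task, parent in parents.items():
        if task == root:
            continue
        if parent not in parents:
            raise ValueError(f"task {task} names unknown parent {parent}")
        children[parent].append(task)

    # Walk down from the root; a parent always exists before its children.
    steps, seen, stack = [], {root}, [root]
    while stack:
        parent = stack.pop()
        for child in sorted(children[parent]):
            seen.add(child)
            steps.append((parent, child))
            stack.append(child)

    # Anything not reached from the root sits on a cycle, which no
    # sequence of clones could have produced.
    if len(seen) != len(parents):
        raise ValueError("image contains tasks unreachable from the root")
    return steps


if __name__ == "__main__":
    print(plan_clones({1: None, 2: 1, 3: 1, 4: 2}, root=1))
```

Each returned pair stands for one clone call issued by the already-created
parent, so the list can be replayed front to back.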