[RFC][PATCH] ns: Syscalls for better namespace sharing control.

Pavel Emelyanov xemul at parallels.com
Sat Feb 27 01:21:53 PST 2010


Eric W. Biederman wrote:
> Pavel Emelyanov <xemul at parallels.com> writes:
> 
>> Eric W. Biederman wrote:
>>> Pavel Emelyanov <xemul at parallels.com> writes:
>>>
>>>>>> Yet another set of per-namespace IDs along with CLONE_NEWXXX ones?
>>>>>> I currently have a way to create all namespaces we have with one
>>>>>> syscall. Why don't we have an ability to enter them all with one syscall?
>>>>> The CLONE_NEWXXX series of bits has been a royal pain to work with,
>>>>> and it appears to be an unnecessary complication for no gain.
>>>> That's the answer for the "Yet another set..." question.
>>>> How about the "Why don't we have..." one?
>>> I am not certain which question you are asking:
>>>
>>> Why don't we have an ability to enter all namespaces with one syscall
>>> invocation?
>> Exactly. Please add at least the NSTYPE_NSPROXY or whatever, that will
>> pin all namespaces of a given pid from the very beginning.
> 
> For nsfd(2) that is doable.  At least for now setns can't restore it.

Thanks. What's the problem with setns?

>>> Why don't we have a syscall that allows us to enter every namespace?
>> This one is done in the patch, no?
>>
>> Although the approach is OK with me, there's one design issue that came
>> to my mind recently: can we use this fd to wait for a namespace to
>> stop? I currently don't see this ability, but this is something I require
>> badly.
> 
> I have designed these file descriptors to pin the namespaces, so
> waiting for them to exit isn't something they can do now.  It makes a
> lot of sense to have similar ones that take weak references to the namespaces
> that we can use to wait for a namespace to exit.

Yes, I saw this from the patches. Eric, I'd very much appreciate it if we
could work out a solution that lets us kill two birds with one stone.
I do not want to invent yet another bunch of system calls for "taking
a weak reference".

To start the brainstorm: can we use inotify/dnotify for this?
Or maybe we should equip the nsfd call with a flags argument and
add a flag for a weak reference? In that case, how would we be
notified that the namespace is dead? With poll? Or maybe it is worth
making sys_close return only once the namespace is dead (by providing
a proper ->release callback for the file)?
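
To make the poll option concrete, here is a rough userspace sketch of
how a waiter might look. It assumes, purely hypothetically, that a
weak-reference fd from nsfd would report POLLHUP once the namespace is
gone; nothing in the posted patches does this today. The helper name
wait_for_hangup is made up for illustration, and only poll(2) itself
is a real interface here.

```c
#include <errno.h>
#include <poll.h>

/*
 * Wait until the object behind fd reports hang-up or error.
 *
 * ASSUMPTION: a weak-reference namespace fd would raise POLLHUP when
 * the namespace dies. That behaviour is hypothetical; the function
 * itself works on any pollable fd (a pipe, a socket, ...).
 *
 * Returns 1 on hang-up/error, 0 on timeout, -1 on failure.
 * timeout_ms < 0 means wait forever. The timeout is restarted if
 * poll() is interrupted by a signal.
 */
static int wait_for_hangup(int fd, int timeout_ms)
{
    /* events = 0: we only care about POLLHUP/POLLERR/POLLNVAL,
     * which poll() always reports regardless of the request mask. */
    struct pollfd pfd = { .fd = fd, .events = 0, .revents = 0 };
    int ret;

    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret < 0)
        return -1;              /* poll() itself failed */
    if (ret == 0)
        return 0;               /* timed out, nothing happened */
    if (pfd.revents & POLLNVAL)
        return -1;              /* fd is not open */
    return (pfd.revents & (POLLHUP | POLLERR)) ? 1 : 0;
}
```

The same shape can be tried out today with the read end of a pipe
standing in for the namespace fd: the call returns 1 once the last
writer closes its end.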

> Eric
> 


