[PATCH 1/1] namespaces: introduce sys_hijack (v11)
Serge E. Hallyn
serue at us.ibm.com
Fri Aug 1 07:22:06 PDT 2008
Quoting KOSAKI Motohiro (kosaki.motohiro at jp.fujitsu.com):
> fork() is very important for system performance.
> I worry about performance regression if this feature isn't used.
> Could you measure spawn benchmark in unixbench?
Yes, I will do that. I'll send the results next week.
Some context and history about this patch:
1. Originally the 'enter a namespace' functionality was
implemented (in vserver and in -lxc, maybe also in
openvz) by specifying an integer id for a container
to enter. See for instance the patches under
2. At one point I started extending the ns_cgroup to
allow namespace entering by attaching to an ns_cgroup.
So 'echo $$ > /containers/5487/tasks' would switch
your nsproxy. People didn't like that much in part
because switching the namespaces while a task is
executing seems fraught with potential dangers,
though those remain unexplored.
3. Eric suggested that namespace entering should just
be done by ptracing a process in a container and
bending it to our will. The hijack patchset was
sort of a response to that. Strictly using ptrace
seemed to me to leave us too much at the
mercy of the target's context.
For instance, if using selinux, when you ptrace a
target you are subject to the target's permissions.
With hijack, you'll continue with your own,
presumably administrator-level, privilege level.
4. We stopped posting this patch some time ago because
it seems worth seeing just how far we can go in
terms of analyzing and configuring containers by
properly setting up namespaces (i.e. hierarchical
pid and user namespaces) and using filesystems
(in conjunction with mounts propagation) to do the
But at the mini-summit last week it was mentioned
that OpenVZ still wants namespace entering, and
there seemed to be no real opposition, so we told
Pavel we would repost this patchset.