[PATCH 1/1] namespaces: introduce sys_hijack (v11)

Serge E. Hallyn serue at us.ibm.com
Fri Aug 1 07:22:06 PDT 2008


Quoting KOSAKI Motohiro (kosaki.motohiro at jp.fujitsu.com):
> Hi
> 
> fork() is very important for system performance.
> I worry about performance regression if this feature isn't used.
> 
> Could you measure the spawn benchmark in unixbench?

Yes, I will do that.  I'll send the results next week.

Some context and history about this patch:

	1. Originally the 'enter a namespace' functionality was
	   implemented (in vserver and in -lxc, maybe also in
	   openvz) by specifying an integer id for a container
	   to enter.  See for instance the patches under
	   http://lxc.sourceforge.net/patches/2.6.25/2.6.25-rc8-mm2-lxc1/broken-out/bind_ns/

	2. At one point I started extending the ns_cgroup to
	   allow namespace entering by attaching to an ns_cgroup.
	   So 'echo $$ > /containers/5487/tasks' would switch
	   your nsproxy.  People didn't like that much in part
	   because switching the namespaces while a task is
	   executing seems fraught with potential dangers,
	   though those remain unexplored.

	3. Eric suggested that namespace entering should just
	   be done by ptracing a process in a container and
	   bending it to our will.  The hijack patchset was
	   sort of a response to that.  Strictly using ptrace
	   seemed to me to leave us too much at the
	   mercy of the target's context.
	   For instance, if using selinux, when you ptrace a
	   target you are subject to the target's permissions.
	   With hijack, you'll continue with your own,
	   presumably administrator-level, privilege level.

	4. We stopped posting this patch some time ago because
	   it seemed worth seeing just how far we can go in
	   terms of analyzing and configuring containers by
	   properly setting up namespaces (i.e. hierarchical
	   pid and user namespaces) and using filesystems
	   (in conjunction with mount propagation) to do the
	   configuration.
	   But at the mini-summit last week it was mentioned
	   that OpenVZ still wants namespace entering, and
	   there seemed to be no real opposition, so we told
	   Pavel we would repost this patchset.

thanks,
-serge

