[0/10] User namespaces: introduction

Serge E. Hallyn serue at us.ibm.com
Fri Aug 22 12:45:13 PDT 2008


Hi Eric,

Here is a start on a userns patchset that tries to follow your ideas
about how user namespaces and filesystems should interact.  Ignore
the bookkeeping crap or you'll pull your hair out.  Lots of stuff
remains unimplemented, e.g. chown (setattr) and proper handling of
capabilities.  But you can do some fun things with this patchset.
For example:

	(log in as root)
	setcap cap_sys_admin=ep ns_exec
	setcap cap_sys_admin=ep usernsmount
	ns_exec -U /bin/sh
	ls /root (fails)
	ls / (succeeds)
	(log in as hallyn)
	ns_exec -U /bin/sh
	id
		(uid=0, gid=0)
	ls (fails, can't descend /home/hallyn)
	usernsmount / nsid=4
	ls (succeeds)
	touch ab
	ls -l ab
		(ab is owned by root)
	exit
	(we're logged in as hallyn in the init_user_ns again)
	ls -l ab
		(ab is owned by hallyn)

The only supported fs is ext3, and only a few operations are supported.
In the example above, where we are hallyn in the init_user_ns but root
in the child user ns:
	when we create a file, it is handled properly:
		inode->i_uid=500, and an xattr (nsid=4,uid=0) is added
	when we chown the file to root, it is not handled properly:
		inode->i_uid is set to 0
At this point it's just a matter of hooking all the remaining places.
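
To make the file-creation case concrete, here is a small Python model
(not kernel code).  It is only a sketch: the names Inode, ns_owners,
create_file and owner_seen_from are made up for illustration, using
nsid 0 for the init_user_ns is an assumption, and returning None for a
namespace with no recorded owner is a placeholder, since that case is
not described here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

INIT_NSID = 0  # assumption: id standing in for init_user_ns


@dataclass
class Inode:
    # Owner as seen from init_user_ns (what inode->i_uid holds).
    i_uid: int
    # Stand-in for the per-namespace xattr: nsid -> uid inside that ns.
    ns_owners: Dict[int, int] = field(default_factory=dict)


def create_file(creator_init_uid: int, nsid: int, uid_in_ns: int) -> Inode:
    """Model file creation by a task in user namespace `nsid`."""
    inode = Inode(i_uid=creator_init_uid)
    if nsid != INIT_NSID:
        # Record who owns the file from the child namespace's view.
        inode.ns_owners[nsid] = uid_in_ns
    return inode


def owner_seen_from(inode: Inode, nsid: int) -> Optional[int]:
    """Return the owner uid as observed from namespace `nsid`."""
    if nsid == INIT_NSID:
        return inode.i_uid
    # None means "no owner recorded for this namespace" (placeholder).
    return inode.ns_owners.get(nsid)
```

With the values from the session, create_file(500, 4, 0) gives an inode
that reports uid 0 inside nsid 4 and uid 500 in the init_user_ns, which
matches the two "ls -l ab" results.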

Capabilities remain a problem.  Right now I think capabilities will
need to be split into system-wide caps and container-safe caps.
CAP_NET_ADMIN, CAP_NET_RAW and CAP_DAC_OVERRIDE would be container-safe.
CAP_SYS_BOOT (reboot) may become container-safe one day, but for now
is very much system-wide.

So if I'm uid 500 on the host and create a user namespace where I'm
uid=0, I should be able to acquire container-safe caps (perhaps
contingent on whether I unshared all other namespaces), but not
system-wide ones.  Or, whether I can acquire them would depend
on whether the suid bit was set in a user_ns or not.  sigh.
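
A rough Python sketch of the split, just to pin down the idea.  The
function name acquirable_caps and its in_init_user_ns flag are
hypothetical.  The container-safe set is only the three caps named
above, and the reboot capability appears under its kernel header name,
CAP_SYS_BOOT.  Letting a task in the init_user_ns keep everything it
asks for is an assumption of the model.  The possible conditions on
unsharing the other namespaces or on the suid bit are deliberately left
out.

```python
# Capabilities proposed above as safe to grant inside a container.
CONTAINER_SAFE = frozenset({
    "CAP_NET_ADMIN",
    "CAP_NET_RAW",
    "CAP_DAC_OVERRIDE",
})


def acquirable_caps(requested, in_init_user_ns):
    """Return the subset of `requested` caps a uid-0 task may acquire.

    In the init_user_ns nothing is filtered (assumption of this model).
    In a child user namespace only container-safe caps survive.
    """
    requested = frozenset(requested)
    if in_init_user_ns:
        return requested
    return requested & CONTAINER_SAFE
```

So root in a child user ns asking for CAP_NET_RAW and CAP_SYS_BOOT
would end up with CAP_NET_RAW only.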

thanks,
-serge

