[PATCH] Relax ns_can_attach checks to allow attaching to grandchild cgroups

Serge E. Hallyn serue at us.ibm.com
Fri Dec 19 15:34:13 PST 2008

Quoting Grzegorz Nosek (root at localdomain.pl):
> On Fri, Dec 19, 2008 at 04:23:04 -0600, Serge E. Hallyn wrote:
> > Quoting Andrew Morton (akpm at linux-foundation.org):
> > > (cc containers at lists.osdl.org)
> > > 
> > > Please don't send patches via private email!
> My apologies.
> > I trust (since you're not removing it) that the restriction that
> > the target cgroup be empty is not a problem?
> Sigh, good catch. I'm building my lxc-based environment slowly and I'm
> only testing the most basic stuff currently, so I'd bug you about it
> eventually.
> Frankly, I don't understand the reason behind these restrictions and
> feel like I'm missing some important piece of a puzzle. In my tests all
> the tasks in question are living in the same namespace (though it won't
> always be so), so I'd guess I should be able to move the tasks freely
> between cgroups. Why exactly does the target cgroup have to be empty?

The restriction goes back to one of the original motivations for
the ns cgroup: facilitating actual moves between namespaces.  Since
that is no longer being considered, easing the restrictions is ok.

On the other hand, the only remaining use for the ns cgroup is to
provide some locking of tasks/containers into cgroups.  So the main
restriction I'd like to keep in place is that you can only go
downward in the cgroup hierarchy.  (Think of the devices whitelist
cgroup.)

Now maybe it makes sense to split the two things ns_cgroup does
into 2+ cgroups.  One (nstrack) would simply clone a new child
cgroup every time a task does an unshare.  Another (if mounted)
would prevent a task from moving to a cgroup which isn't a
descendant of its current one.  A third could do more complicated
controls: e.g. a task could be locked into cgroup:/a/b, after which
it could freely move up and down under cgroup:/a/b (say, to switch
between cgroup:/a/b/c1 and cgroup:/a/b/c2), but could never escape
cgroup:/a/b.  Then you could choose which, if any, movement-controlling
and movement-tracking cgroups to compose.
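
To make the two movement-controlling variants concrete, here is a
rough userspace sketch in C.  It is not kernel code and not a proposed
patch: the function names are made up for illustration, and it assumes
cgroups are identified by absolute, normalized path strings (no trailing
slash), whereas real code would more likely walk the cgroup parent
pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if `path` names `base` itself or something beneath it.
 * Assumption: both are absolute, normalized paths such as "/a/b". */
static bool path_at_or_under(const char *base, const char *path)
{
    assert(base != NULL && path != NULL);
    size_t n = strlen(base);

    if (strcmp(base, "/") == 0)
        return path[0] == '/';      /* everything lives under the root */
    if (strncmp(base, path, n) != 0)
        return false;
    /* Reject "/a/bc" when base is "/a/b": the match must end a component. */
    return path[n] == '\0' || path[n] == '/';
}

/* "Downward only": the destination must be a strict descendant of the
 * cgroup the task is currently in.  (Hypothetical helper.) */
bool may_move_down_only(const char *cur, const char *dst)
{
    return strcmp(cur, dst) != 0 && path_at_or_under(cur, dst);
}

/* "Locked subtree": a task locked into `lock_root` may move to any
 * cgroup at or below it, regardless of where it sits right now.
 * (Hypothetical helper.) */
bool may_move_locked(const char *lock_root, const char *dst)
{
    return path_at_or_under(lock_root, dst);
}
```

With a task locked into /a/b, may_move_locked("/a/b", "/a/b/c2") allows
the move and may_move_locked("/a/b", "/a") refuses it, while the
downward-only check would also refuse a move from /a/b/c1 to /a/b/c2
since c2 is a sibling rather than a descendant.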

I actually rather like that idea, but I think we'd have to
keep the ns cgroup the way it is, while using new cgroups to
offer the new functionality.

> Also, should we remember the task->nsproxy pointer in the cgroup data
> and ignore hierarchy if it matches? I guess it would be safe to store
> the raw pointer without refcounting it in any way as we'd never
> dereference it (could keep it as uintptr_t to reinforce the idea) but
> only compare with another pointer.

No, I'm thinking that despite the name, since we won't use it to
actually enter namespaces, we should keep it decoupled from nsproxy.

> Does that make any sense? Or should I simply mount the cgroup fs without
> the ns subsystem and forget the whole thing? What exactly do I lose by
> doing so?

With liblxc I think you might lose the way that it keeps track of
containers.  Not sure though - give it a shot.

> > Also, 'rule 1' in the comment above ns_can_attach should be modified
> > accordingly (s/child/descendant).
> Indeed. Will resend after receiving some enlightenment about the above.
> Thank you for your comments.
> Best regards,
>  Grzegorz Nosek
