[PATCHv1 7/8] cgroup: cgroup namespace setns support

Eric W. Biederman ebiederm at xmission.com
Sun Oct 19 05:23:39 UTC 2014

"Serge E. Hallyn" <serge at hallyn.com> writes:

> Quoting Aditya Kali (adityakali at google.com):
>> On Thu, Oct 16, 2014 at 2:12 PM, Serge E. Hallyn <serge at hallyn.com> wrote:
>> > Quoting Aditya Kali (adityakali at google.com):
>> >> setns on a cgroup namespace is allowed only if
>> >> * task has CAP_SYS_ADMIN in its current user-namespace and
>> >>   over the user-namespace associated with target cgroupns.
>> >> * task's current cgroup is descendent of the target cgroupns-root
>> >>   cgroup.
>> >
>> > What is the point of this?
>> >
>> > If I'm a user logged into
>> > /lxc/c1/user.slice/user-1000.slice/session-c12.scope and I start
>> > a container which is in
>> > /lxc/c1/user.slice/user-1000.slice/session-c12.scope/x1
>> > then I will want to be able to enter the container's cgroup.
>> > The container's cgroup root is under my own (satisfying the
>> > below condition) but my cgroup is not a descendent of the
>> > container's cgroup.
>> >
>> This condition is there because we don't want to do implicit cgroup
>> changes when a process attaches to another cgroupns. cgroupns tries to
>> preserve the invariant that at any point, your current cgroup is
>> always under the cgroupns-root of your cgroup namespace. But in your
>> example, if we allow a process in "session-c12.scope" container to
>> attach to cgroupns root'ed at "session-c12.scope/x1" container
>> (without implicitly moving its cgroup), then this invariant won't
>> hold.
> Oh, I see.  Guess that should be workable.  Thanks.
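
For reference, the invariant Aditya describes (a task's cgroup always
sits at or below the cgroupns-root) comes down to a path-prefix test on
component boundaries.  A minimal userspace sketch, assuming cgroups are
compared by their absolute path strings; the helper name
cgroup_path_is_descendant() is hypothetical and is not taken from the
patch:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if cgroup path `cg` is equal to
 * `root` or lies somewhere below it in the hierarchy.
 */
static bool cgroup_path_is_descendant(const char *cg, const char *root)
{
    size_t n = strlen(root);

    /* Ignore trailing slashes on root so that "/" matches everything. */
    while (n > 0 && root[n - 1] == '/')
        n--;

    if (strncmp(cg, root, n) != 0)
        return false;

    /* Require a component boundary so "/a/bc" is not under "/a/b". */
    return cg[n] == '\0' || cg[n] == '/';
}
```

With Serge's example, a task in ".../session-c12.scope/x1" is a
descendant of ".../session-c12.scope", but not the other way around,
which is why the setns in his scenario is refused.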

Which has me looking at what the rules are for moving through
the cgroup hierarchy.

As long as we have write access to cgroup.procs and are allowed
to open the file for write, we can move any of our own tasks
into the cgroup.  So the cgroup namespace rules don't seem
to be a problem.
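
To make the mechanism concrete: moving a task amounts to opening the
destination cgroup's cgroup.procs for write and writing the pid to it.
A rough userspace sketch follows; the function name
move_pid_to_cgroup() and the error convention are illustrative
assumptions, not part of any existing API:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `pid` into the cgroup.procs file at
 * `procs_path`.  Returns 0 on success or a negative errno value.
 */
static int move_pid_to_cgroup(const char *procs_path, pid_t pid)
{
    char buf[32];
    ssize_t written;
    int len, fd, ret = 0;

    /* The open itself is subject to the file permissions on cgroup.procs. */
    fd = open(procs_path, O_WRONLY);
    if (fd < 0)
        return -errno;

    len = snprintf(buf, sizeof(buf), "%ld\n", (long)pid);
    written = write(fd, buf, (size_t)len);
    if (written < 0)
        ret = -errno;   /* the kernel rejected the move */
    else if (written != len)
        ret = -EIO;     /* short write, treat as failure */

    close(fd);
    return ret;
}
```

A caller would pass something like the target cgroup's cgroup.procs
path and getpid(); both the open and the write can fail with a
permission error.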

Andy can you please take a look at the permission checks in

As I read the code I see 3 security gaffes in the permission check.
- Using current->cred instead of file->f_cred.
- Not checking tcred->euid.
- Checking GLOBAL_ROOT_UID instead of having a capable call.
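
The three points can be illustrated with a small userspace model.  This
is only a sketch and not the kernel code: struct cred_model, the
has_capability flag, and both function names are made up for
illustration, the target fields compared in the "flawed" variant are an
assumption, and requiring all three target uids to match in the
"sketch" variant is one possible reading of the second point.

```c
#include <stdbool.h>

/* Simplified stand-in for a credential set (model only). */
struct cred_model {
    unsigned int uid, euid, suid;
    bool has_capability;    /* stand-in for a capable()-style check */
};

/*
 * Models the check as criticized above: it uses the credentials of the
 * task doing the write, compares against uid 0 instead of a capability,
 * and never looks at the target's euid.
 */
static bool may_attach_flawed(const struct cred_model *current_cred,
                              const struct cred_model *tcred)
{
    return current_cred->euid == 0 ||
           current_cred->euid == tcred->uid ||
           current_cred->euid == tcred->suid;
}

/*
 * One possible tightening (an assumption, not a proposed patch): use
 * the credentials captured when the file was opened, replace the uid 0
 * test with a capability, and require the target's uid, euid and suid
 * to all match.
 */
static bool may_attach_sketch(const struct cred_model *file_cred,
                              const struct cred_model *tcred)
{
    if (file_cred->has_capability)
        return true;
    return file_cred->euid == tcred->uid &&
           file_cred->euid == tcred->euid &&
           file_cred->euid == tcred->suid;
}
```

In this model a writer with uids 1000/1000/1000 passes the flawed check
against a target with uid 1000 but euid and suid 0, and fails the
tightened one.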

The file permissions on cgroup.procs seem just sufficient to keep
those bugs from being easily exploitable.

