[PATCHv1 7/8] cgroup: cgroup namespace setns support

Andy Lutomirski luto at amacapital.net
Thu Oct 16 21:17:18 UTC 2014


On Thu, Oct 16, 2014 at 2:12 PM, Serge E. Hallyn <serge at hallyn.com> wrote:
> Quoting Aditya Kali (adityakali at google.com):
>> setns on a cgroup namespace is allowed only if
>> * task has CAP_SYS_ADMIN in its current user-namespace and
>>   over the user-namespace associated with target cgroupns.
>> * task's current cgroup is a descendant of the target cgroupns-root
>>   cgroup.
>
> What is the point of this?
>
> If I'm a user logged into
> /lxc/c1/user.slice/user-1000.slice/session-c12.scope and I start
> a container which is in
> /lxc/c1/user.slice/user-1000.slice/session-c12.scope/x1
> then I will want to be able to enter the container's cgroup.
> The container's cgroup root is under my own (satisfying the
> below condition) but my cgroup is not a descendant of the
> container's cgroup.
>

Presumably you need to ask your friendly cgroup manager to stick you
in that cgroup first.  Or we need to generally allow tasks to move
themselves deeper in the hierarchy, but that seems like a big change.
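
To make the interaction of the two ancestry checks concrete, here is a
minimal userspace sketch. It is only a model, not the kernel code:
cgroups are represented by normalized absolute path strings (no
trailing slash), and path_is_descendant() and setns_allowed() are
hypothetical helpers invented for illustration. As with the kernel's
cgroup_is_descendant(), a cgroup counts as a descendant of itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical model: a cgroup is identified by its absolute path.
 * Returns true if cgrp equals ancestor or lies somewhere below it.
 */
static bool path_is_descendant(const char *cgrp, const char *ancestor)
{
	size_t n;

	assert(cgrp && ancestor);

	/* Every absolute path is under the hierarchy root. */
	if (strcmp(ancestor, "/") == 0)
		return cgrp[0] == '/';

	n = strlen(ancestor);
	if (strncmp(cgrp, ancestor, n) != 0)
		return false;

	/* Reject sibling prefixes such as "/lxc/c12" vs. "/lxc/c1". */
	return cgrp[n] == '\0' || cgrp[n] == '/';
}

/*
 * Models the two ancestry checks in the quoted cgroupns_install():
 *  1. the task's current cgroup must be under the target ns root;
 *  2. the target ns root must be under the task's current ns root.
 * The capability and default-hierarchy checks are not modeled.
 */
static bool setns_allowed(const char *task_cgrp,
			  const char *task_ns_root,
			  const char *target_ns_root)
{
	if (!path_is_descendant(task_cgrp, target_ns_root))
		return false;
	if (!path_is_descendant(target_ns_root, task_ns_root))
		return false;
	return true;
}
```

With Serge's paths, a task sitting in .../session-c12.scope fails the
first check against a namespace rooted at .../session-c12.scope/x1,
and passes once something has moved it into x1. A task whose own
namespace is rooted at x1 fails the second check when it targets a
namespace rooted higher up.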

--Andy

>
>> * target cgroupns-root is same as or deeper than task's current
>>   cgroupns-root. This is so that the task cannot escape out of its
>>   cgroupns-root. This also ensures that setns() only makes the task
>>   get restricted to a deeper cgroup hierarchy.
>>
>> Signed-off-by: Aditya Kali <adityakali at google.com>
>> ---
>>  kernel/cgroup_namespace.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 42 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/cgroup_namespace.c b/kernel/cgroup_namespace.c
>> index c16604f..c612946 100644
>> --- a/kernel/cgroup_namespace.c
>> +++ b/kernel/cgroup_namespace.c
>> @@ -80,8 +80,48 @@ err_out:
>>
>>  static int cgroupns_install(struct nsproxy *nsproxy, void *ns)
>>  {
>> -     pr_info("setns not supported for cgroup namespace");
>> -     return -EINVAL;
>> +     struct cgroup_namespace *cgroup_ns = ns;
>> +     struct task_struct *task = current;
>> +     struct cgroup *cgrp = NULL;
>> +     int err = 0;
>> +
>> +     if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN) ||
>> +         !ns_capable(cgroup_ns->user_ns, CAP_SYS_ADMIN))
>> +             return -EPERM;
>> +
>> +     /* Prevent cgroup changes for this task. */
>> +     threadgroup_lock(task);
>> +
>> +     cgrp = get_task_cgroup(task);
>> +
>> +     err = -EINVAL;
>> +     if (!cgroup_on_dfl(cgrp))
>> +             goto out_unlock;
>> +
>> +     /* Allow switch only if the task's current cgroup is descendant of the
>> +      * target cgroup_ns->root_cgrp.
>> +      */
>> +     if (!cgroup_is_descendant(cgrp, cgroup_ns->root_cgrp))
>> +             goto out_unlock;
>> +
>> +     /* Only allow setns to a cgroupns root-ed deeper than task's current
>> +      * cgroupns-root. This will make sure that tasks cannot escape their
>> +      * cgroupns by attaching to parent cgroupns.
>> +      */
>> +     if (!cgroup_is_descendant(cgroup_ns->root_cgrp,
>> +                               task_cgroupns_root(task)))
>> +             goto out_unlock;
>> +
>> +     err = 0;
>> +     get_cgroup_ns(cgroup_ns);
>> +     put_cgroup_ns(nsproxy->cgroup_ns);
>> +     nsproxy->cgroup_ns = cgroup_ns;
>> +
>> +out_unlock:
>> +     threadgroup_unlock(current);
>> +     if (cgrp)
>> +             cgroup_put(cgrp);
>> +     return err;
>>  }
>>
>>  static void *cgroupns_get(struct task_struct *task)
>> --
>> 2.1.0.rc2.206.gedb03e5
>>



-- 
Andy Lutomirski
AMA Capital Management, LLC

