[PATCHv1 7/8] cgroup: cgroup namespace setns support
luto at amacapital.net
Tue Oct 21 05:03:46 UTC 2014
On Mon, Oct 20, 2014 at 9:49 PM, Eric W. Biederman
<ebiederm at xmission.com> wrote:
> Andy Lutomirski <luto at amacapital.net> writes:
>> On Sun, Oct 19, 2014 at 9:55 PM, Eric W. Biederman <ebiederm at xmission.com> wrote:
>>> On October 19, 2014 1:26:29 PM CDT, Andy Lutomirski <luto at amacapital.net> wrote:
>>>> Is the idea
>>>>that you want a privileged user wrt a cgroupns's userns to be able to
>>>>use this? If so:
>>>>Yes, that current_cred() thing is bogus. (Actually, this is probably
>>>>exploitable right now if any cgroup.procs inode anywhere on the system
>>>>lets non-root write.) (Can we have some kernel debugging option that
>>>>makes any use of current_cred() in write(2) warn?)
>>>>We really need a weaker version of may_ptrace for this kind of stuff.
>>>>Maybe the existing may_ptrace stuff is okay, actually. But this is
>>>>completely missing group checks, cap checks, capabilities wrt the
>>>>Also, I think that, if this version of the patchset allows non-init
>>>>userns to unshare cgroupns, then the issue of what permission is
>>>>needed to lock the cgroup hierarchy like that needs to be addressed,
>>>>because unshare(CLONE_NEWUSER|CLONE_NEWCGROUP) will effectively pin
>>>>the calling task with no permission required. Bolting on a fix later
>>>>will be a mess.
>>> I imagine the pinning would be like the userns.
>>> Ah but there is a potentially serious issue with the pinning.
>>> With pinning we can make it impossible for root to move us to a different cgroup.
>>> I am not certain how serious that is but it bears thinking about.
>>> If we don't implement pinning we should be able to implement everything with just filesystem mount options, and no new namespace required.
>>> I am too tired tonight to see the end game in this.
>> Possible solution:
>> Ditch the pinning. That is, if you're outside a cgroupns (or you have
>> a non-ns-confined cgroupfs mounted), then you can move a task in a
>> cgroupns outside of its root cgroup. If you do this, then the task
>> thinks its cgroup is something like "../foo" or "../../foo".
> Of the possible solutions that seems attractive to me, simply because
> we sometimes want to allow clever things to occur.
> Does anyone know of a reason (beyond pretty printing) why we need
> cgroupns to restrict the subset of cgroups processes can be in?
> I would expect permissions on the cgroup directories themselves, and
> limited visibility would be (in general) to achieve the desired
This makes the security impact of cgroupns very easy to understand,
right? Because there really won't be any -- cgroupns only affects
reads from /proc and what cgroupfs shows, but it doesn't change any
actual cgroups, nor does it affect any cgroup *changes*.
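
To make the "display only" idea concrete, here is a rough sketch of how a task's cgroup could be rendered relative to a cgroupns root, including the "../foo" case when the task has been moved outside that root. This is only an illustration of the proposal in this thread, not what any kernel does; the function name and the leading-slash convention for paths inside the root are assumptions.

```python
import posixpath


def cgroup_path_as_seen(ns_root: str, task_cgroup: str) -> str:
    """Render task_cgroup as it might appear from a cgroupns rooted at ns_root.

    Both arguments are absolute paths within a single cgroup hierarchy.
    Hypothetical helper: the name and output format are assumptions.
    """
    rel = posixpath.relpath(task_cgroup, ns_root)
    if rel == ".":
        # The task sits exactly at the namespace root.
        return "/"
    if rel.startswith(".."):
        # The task was moved outside the namespace root, so show a
        # relative path that climbs out of it.
        return rel
    # The task is somewhere below the namespace root.
    return "/" + rel


if __name__ == "__main__":
    print(cgroup_path_as_seen("/a/b", "/a/b/c"))
    print(cgroup_path_as_seen("/a/b", "/a/foo"))
    print(cgroup_path_as_seen("/a/b", "/foo"))
```

Nothing here touches cgroup membership. It only changes how an existing membership is printed, which is the point being made above.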
>> While we're at it, consider making setns for a cgroupns *not* change
>> the caller's cgroup. Is there any reason it really needs to?
> setns doesn't but nsenter is going to need to change the cgroup
> if the pinning requirement is kept. nsenter is going to want to
> change the cgroup if the pinning requirement is dropped.
It seems easy enough for nsenter to change the cgroup all by itself.
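
As a rough userspace sketch of what that would look like: a tool moves itself into the target cgroup by writing its pid to that cgroup's cgroup.procs file, and only then calls setns() on the namespace fd. The helper name is made up, and the sketch covers only the cgroup move; it assumes the caller has write permission on cgroup.procs.

```python
import os
from typing import Optional


def move_task_to_cgroup(cgroup_dir: str, pid: Optional[int] = None) -> int:
    """Move a task into the cgroup at cgroup_dir via its cgroup.procs file.

    Hypothetical helper for illustration. cgroup_dir is the cgroup's
    directory in a mounted cgroupfs; pid defaults to the calling process.
    Returns the pid that was written.
    """
    if pid is None:
        pid = os.getpid()
    procs_file = os.path.join(cgroup_dir, "cgroup.procs")
    # Writing a pid to cgroup.procs asks the kernel to migrate that
    # process into this cgroup; the usual permission checks apply.
    with open(procs_file, "w") as f:
        f.write(f"{pid}\n")
    return pid
```

An nsenter-like tool would call this first and then enter the namespace, so setns() itself never needs to change the caller's cgroup.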