Why does devices cgroup check for CAP_SYS_ADMIN explicitly?

Tue Nov 6 15:41:05 UTC 2012

Hello,

On Tue, Nov 06, 2012 at 09:30:32AM -0600, Serge Hallyn wrote:
> > So, you don't really have any actual use case for the explicit CAP_*
> > checks, right?
> 
> No, especially since we will now have user namespaces.
> 
> We will want to be able to strictly enforce hierarchical limits - i.e.
> allow uid 100000 (which is uid 0 in the container) to change cgroup
> settings, but never exceed limits set on the parent directory.  IIUC
> that is what you are working toward anyway with the general hierarchy
> work? (thanks for that).

Yeah, I'm working toward that but I'm not sure it would mean that
containers would be able to directly bind mount a cgroupfs subdirectory
and have free rein on it.  Maybe such a thing can be made to work but I
would be much more comfortable having something in between for
impedance matching (in access policies, root cgroup behavior matching,
whatnot).  So, the functionality will be there but it probably would
need something in between if you wanna give containers control over
their own cgroup hierarchies.

There are some issues tho.  As it currently stands, the devices cgroup
inherits its configuration from the parent rather than enforcing the
hierarchy when checking for access permission.  This means that changes
in an ancestor have to be propagated downwards and *update* the
configurations of descendants.  That is what I'm working on, but it can
be confusing for someone inside the container.  Without breaking
compatibility, I don't see any other way out tho.  I suppose it's
something we'll have to live with.
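
To make the difference concrete, here is a rough toy model in Python.
It is not kernel code: the Cgroup class, its method names and the
device strings are all made up for illustration, and it assumes a
plain whitelist per cgroup that a child copies from its parent.  It
only sketches "inherit and propagate" versus "walk the ancestors at
check time".

```python
class Cgroup:
    """Toy cgroup holding a whitelist of device strings (hypothetical model)."""

    def __init__(self, name, parent=None, allowed=None):
        self.name = name
        self.parent = parent
        self.children = []
        if allowed is not None:
            self.allowed = set(allowed)
        elif parent is not None:
            # Inheritance: the child starts with a copy of the parent's list.
            self.allowed = set(parent.allowed)
        else:
            self.allowed = set()
        if parent is not None:
            parent.children.append(self)

    def may_access_local(self, dev):
        # Inheriting model: only this cgroup's own list is consulted.
        return dev in self.allowed

    def may_access_hierarchical(self, dev):
        # Enforcing model: every ancestor must allow the device too.
        node = self
        while node is not None:
            if dev not in node.allowed:
                return False
            node = node.parent
        return True

    def deny_and_propagate(self, dev):
        # Inheriting model needs this: push the change into all descendants.
        self.allowed.discard(dev)
        for child in self.children:
            child.deny_and_propagate(dev)


root = Cgroup("root", allowed={"c 1:3", "c 5:1", "b 8:0"})
container = Cgroup("container", parent=root)

# Tighten the ancestor without propagating: the child's copy goes stale.
root.allowed.discard("b 8:0")
stale_local = container.may_access_local("b 8:0")
enforced = container.may_access_hierarchical("b 8:0")

# Tighten the ancestor with propagation: the child's own list is rewritten.
root.deny_and_propagate("c 5:1")
after_propagation = container.may_access_local("c 5:1")

print(stale_local, enforced, after_propagation)
```

In the second case the container's own configuration changes
underneath it, which is the part that can look surprising from the
inside.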

Thanks.

-- 
tejun