LPC 2020 Hackroom Session: summary and next steps for isolated user namespaces

Snaipe snaipe at arista.com
Wed Apr 21 17:27:14 UTC 2021

"Giuseppe Scrivano" <gscrivan at redhat.com> writes:
>>> >> instead of a prctl, I've added a new mode to /proc/PID/setgroups that
>>> >> allows setgroups in a userns locking the current gids.
>>> >> 
>>> >> What do you think about using /proc/PID/setgroups instead of a new
>>> >> prctl()?
>>> >
>>> > It's better than not having it, but two concerns -
>>> >
>>> > 1. some userspace, especially testsuites, could become confused by the fact
>>> > that they can't drop groups no matter how hard they try, since these will all
>>> > still show up as regular groups.
>>> I forgot to send a link to a second patch :-) that completes the feature:
>>> https://github.com/giuseppe/linux/commit/1c5fe726346b216293a527719e64f34e6297f0c2
>>> When the new mode is used, the gids that are not known in the userns do
>>> not show up in userspace.
>> Ah, right - and of course those gids better not be mapped into the namespace :)
>> But so, this is the patch you said you agreed was not worth the extra
>> complexity?
> yes, these two patches are what looked too complex at that time.  The
> problem still exists though, we could perhaps reconsider if the
> extra-complexity is acceptable to address it.

Hey Folks, sorry for necro-bumping, but I've found this discussion
while searching for this specific issue, and it seems like the most
recent relevant discussion on the matter. I'd like to chime in with
our personal experience.

We have a tool[1] that allows unprivileged use of namespaces
(when using a userns, which is the default).

The primary use-case of said tool is lightweight containerization,
but we're also using it for other mundane purposes, like a better
substitute for fakeroot to build and package privileged software
(e.g. sudo or ping, which need to be installed with special
capabilities) without privileges, or to copy file trees that are owned
by the user or their sub-ids.

For the first use-case, it's always safe to drop unmapped groups,
because the target rootfs is always owned by the user or its sub-ids.

For the other use-cases, this is more problematic, as you're all
well aware. Our position right now is that the tool will always
allow setgroups in a user namespace, and that it's not safe to use on
systems that rely on negative access groups.

I think something that hasn't been mentioned is that if a user calls
setgroups with a fixed list of subgids, dropping all unmapped gids, they
don't just gain the ability to access these negative-access files, they
also lose legitimate access to files that their unmapped groups allow
them to access. This is fine for our first use-case, but a bit surprising
for the second one -- and since setgroups never lets us keep unmapped
gids, we have no way to keep these desired groups.
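
To make the two effects concrete, here is a small toy model in Python. It
is only a sketch of the classic owner/group/other check (no ACLs, no
capabilities, and the effective gid is left out for brevity), not the
kernel's actual logic. All ids, file modes and names below are made up
for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class File:
    uid: int   # owning user
    gid: int   # owning group
    mode: int  # permission bits, e.g. 0o640


def can_read(f: File, uid: int, groups: frozenset) -> bool:
    """Simplified permission check: only the FIRST matching class
    (owner, then group, then other) is consulted."""
    if uid == f.uid:
        return bool(f.mode & 0o400)
    if f.gid in groups:
        return bool(f.mode & 0o040)
    return bool(f.mode & 0o004)


# Hypothetical ids, chosen only for this example.
USER = 1000
STAFF = 50         # group that grants access to a shared file
RESTRICTED = 60    # "negative access" group: members are denied
SUBGID = 100000    # a sub-gid that is mapped into the user namespace

shared = File(uid=0, gid=STAFF, mode=0o640)        # readable by staff only
blocked = File(uid=0, gid=RESTRICTED, mode=0o604)  # readable by all but RESTRICTED

before = frozenset({STAFF, RESTRICTED})  # groups held outside the userns
after = frozenset({SUBGID})              # after setgroups() to mapped sub-gids only

for label, groups in (("before", before), ("after", after)):
    print(label,
          "shared:", can_read(shared, USER, groups),
          "blocked:", can_read(blocked, USER, groups))
```

In this model the user can read "shared" but not "blocked" before the
setgroups call, and the situation is exactly reversed afterwards: the
negative-access restriction is bypassed and the legitimate group access
is lost.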

At first glance, a sysctl that explicitly controls this would not
address the above problem, but keeping around the original group list
of the owner of the user ns would have the desired semantics.

Giuseppe's patch seems to address this use case, which would personally
make me very happy.

[1]: https://github.com/aristanetworks/bst

