[PATCH] Revert "vfs: Allow userns root to call mknod on owned filesystems."

Christian Brauner christian at brauner.io
Thu Jul 5 17:34:29 UTC 2018


On Thu, Jul 05, 2018 at 11:48:11AM -0500, Eric W. Biederman wrote:
> 
> Nacked-by: "Eric W. Biederman" <ebiederm at xmission.com>
> 
> Your description is useless.
> 
> It needs to detail exactly what breaks, what regressions and why.
> All I see below is hand waving.
> 
> We need to know why this does not work so someone does not come in and try
> this again.  Or so that someone can fix this and then try again.
> 
> You do not include that kind of information in your commit log.

My commit log explicitly states that if you run systemd services in a
user namespace with PrivateDevices=true it will fail as soon as anything
tries to open such a device node. Before that change this worked just
fine. The commit log also leads to the related thread.
I can come up with a list of services that fail to start if that helps.
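
To make the failure mode concrete, here is a minimal Python sketch of a
probe that separates "mknod() is refused" from "mknod() succeeds but the
node cannot be opened afterwards". The helper name probe_device_node and
its injectable mknod/open_node parameters are hypothetical and exist only
for illustration; this is not code taken from systemd or any runtime.

```python
import os
import stat


def probe_device_node(path, major, minor, mknod=os.mknod, open_node=os.open):
    """Create a char device node at `path`, then try to open it.

    Returns "mknod-denied", "open-denied", or "ok".
    `mknod` and `open_node` are injectable so the logic can be exercised
    without privileges (an assumption made for this sketch).
    """
    dev = os.makedev(major, minor)
    try:
        mknod(path, stat.S_IFCHR | 0o666, dev)
    except PermissionError:
        # The case userspace has traditionally handled.
        return "mknod-denied"
    try:
        fd = open_node(path, os.O_RDONLY)
    except PermissionError:
        # The node exists but is unusable. Note: it is left on disk here.
        return "open-denied"
    os.close(fd)
    return "ok"
```

A service that only checks the result of mknod() never sees the
"open-denied" outcome until something later tries to use the node.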

> 
> Calling mknod to create device nodes can not be widespread.  There are

Well, there are a few. There are the container runtimes (systemd-nspawn,
rkt, runC, LXC, LXD), udev, systemd, and openrc, to name just a few.
Recently we even worked on making udev usable in user namespaces.

Please also note that a lot of applications were switched to fall back
to bind-mounts on mknod() permission failures, since this was the
easiest and least costly way to deal with all of the LSMs, user
namespaces, capability dropping, and seccomp. They would all need far
more complex logic to decide whether to fall back to a bind-mount or not.
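
The fallback pattern described above can be sketched roughly as follows.
This is a minimal illustration in Python, not the code of any particular
runtime: create_device_node and bind_mount are hypothetical helper names,
and the bind mount is done through a ctypes call to mount(2), which is an
assumption of this sketch.

```python
import ctypes
import os

MS_BIND = 4096  # value of MS_BIND from <sys/mount.h> on Linux


def bind_mount(source, target):
    """Bind-mount `source` onto `target` via mount(2) (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_char_p, ctypes.c_ulong, ctypes.c_void_p]
    if libc.mount(source.encode(), target.encode(), None, MS_BIND, None) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), target)


def create_device_node(source, target, mknod=os.mknod, bind_mount=bind_mount):
    """Make `target` refer to the same device as `source`.

    Tries mknod() first; on a permission failure, falls back to
    bind-mounting the existing node. Returns "mknod" or "bind-mount".
    """
    st = os.stat(source)
    try:
        mknod(target, st.st_mode, st.st_rdev)
        return "mknod"
    except PermissionError:
        pass
    # A bind mount of a file needs an existing file as its mount point.
    fd = os.open(target, os.O_CREAT | os.O_WRONLY, 0o000)
    os.close(fd)
    bind_mount(source, target)
    return "bind-mount"
```

Note that the fallback is only taken when mknod() itself fails. If
mknod() succeeds but the resulting node cannot be opened, this logic
never reaches the bind-mount path.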

> not that many privileged processes and calling mknod outside of being
> a specialized process like udev is broken.
> 
> Therefore I refute your assertion that this is a widespread issue.
> 
> 
> I expect somewhere there is a reasonable argument for reverting this
> change on the basis that it causes a regression. You have not made it.

Fair enough, I can rewrite the commit message and focus on the container
workload and container runtime regressions as a clear example if that
seems a sufficient argument to you.

> 
> Until that time I am going to oppose this revert because your
> justification for the revert is lacking.

I sympathize with you wanting a proper and thorough justification and
I'm sorry if I apparently did not provide exhaustive details.
However, I think (see above) that I've provided at least a sufficient
argument in my commit log to start a reasonable discussion about this
that doesn't end with "You're saying the kernel is broken. I'm saying
userspace is broken.".

> 
> 
> It has never been the case that mknod on a device node will guarantee
> that you even can open the device node.  The applications that regress
> are broken.  It doesn't mean we shouldn't be bug compatible, but we darn

It seems a fair assumption to me that you can also interact with an
object you created (with the _right permissions_).
Also if I may retort, I see no good argument why the applications are
broken.

> well should document very clearly the bugs we are being bug compatible
> with.
> 
> Eric

