[CFT][PATCH 00/10] Making new mounts of proc and sysfs as safe as bind mounts (take 2)

Andy Lutomirski luto at amacapital.net
Fri May 29 17:49:59 UTC 2015


On Thu, May 28, 2015 at 9:36 PM, Eric W. Biederman
<ebiederm at xmission.com> wrote:
> Andy Lutomirski <luto at amacapital.net> writes:
>> On May 28, 2015 12:19 PM, "Eric W. Biederman" <ebiederm at xmission.com> wrote:
>>> Kenton Varda <kenton at sandstorm.io> writes:
>>>
>>> We do need to enforce retaining the existing mount flags one way or
>>> another.  Where this really matters is with MS_RDONLY.  We don't want
>>> any old user to be able to mount /proc read-write when root mounted it
>>> read-only.  There is a very real attack vector there.  That attack
>>> almost works in docker containers today and is avoided simply because
>>> docker mounts over a few files on proc.
>>
>> You could drop the nosuid, noexec, and nodev changes and keep just the
>> ro part.  The ro part is probably not an ABI break in the sense of
>> something that actually breaks real programs.
>
> As a change simply removing the code from the existing patches that
> worries about nosuid, noexec, and the nodev flags is certainly doable.
> It is the best proposal I have heard so far.
>
> I remain unconvinced about ignoring those flags:
> - There are clearly people who think it matters (or else proc and sysfs
>   would not have those flags specified).
>
> - There have been times when it actually has mattered.
>   Aka when files like /proc/self/env could be chmodded and used for
>   privilege escalation.
>
> - The code in lxc and libvirt-lxc so far has been clearly buggy.
>   * lxc only has problems with sysfs (in some configurations).
>   * libvirt-lxc only has problems on a bind mount remount of
>     proc after remounting proc properly.
>
> So I am leaning towards enforcing all of the mount flags including
> nosuid, noexec, and nodev.  Then when the next subtle bug in proc or
> sysfs with respect to chmod shows up I will be able to sleep soundly at
> night because the mount flags of those filesystems allow a mitigation,
> and I did not sabotage the mitigation.

One option would be to break the nosuid, nodev, and noexec parts into
their own patch and avoid tagging that patch for -stable if at all
possible.  The ro part could then go to -stable on its own, and we
would avoid another -stable ABI break.
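
To make the split concrete, here is a rough userspace sketch of the rule
being discussed: a new mount is refused if it drops a restrictive flag
that the existing mount carries.  This is only an illustration, not the
code in the patches; the names (RESTRICTIVE_FLAGS, missing_flags,
may_mount) and the use of comma-separated option strings are assumptions
made up for the example.

```python
# Hypothetical sketch: these names are illustrative, not kernel identifiers.
RESTRICTIVE_FLAGS = frozenset({"ro", "nosuid", "nodev", "noexec"})


def parse_options(options: str) -> frozenset:
    """Split a comma-separated mount option string into a set of options."""
    return frozenset(opt for opt in options.split(",") if opt)


def missing_flags(existing: str, requested: str,
                  enforced=RESTRICTIVE_FLAGS) -> frozenset:
    """Return enforced flags set on the existing mount but absent from the request."""
    return (parse_options(existing) & frozenset(enforced)) - parse_options(requested)


def may_mount(existing: str, requested: str,
              enforced=RESTRICTIVE_FLAGS) -> bool:
    """Allow the new mount only if it keeps every enforced flag already present."""
    return not missing_flags(existing, requested, enforced)
```

Passing enforced={"ro"} models a patch that only checks read-only, while
the default set models the stricter follow-up patch that also checks
nosuid, nodev, and noexec.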

--Andy

