[PATCH] devpts: Add ptmx_uid and ptmx_gid options

Eric W. Biederman ebiederm at xmission.com
Thu May 28 16:44:21 UTC 2015

Andy Lutomirski <luto at amacapital.net> writes:

> On Thu, Apr 2, 2015 at 11:27 AM, Eric W. Biederman
> <ebiederm at xmission.com> wrote:
>> Andy Lutomirski <luto at amacapital.net> writes:
>>> On Thu, Apr 2, 2015 at 7:29 AM, Alexander Larsson <alexl at redhat.com> wrote:
>>>> On Thu, 2015-04-02 at 07:06 -0700, Andy Lutomirski wrote:
>>>>> On Thu, Apr 2, 2015 at 3:12 AM, James Bottomley
>>>>> <James.Bottomley at hansenpartnership.com> wrote:
>>>>> > On Tue, 2015-03-31 at 16:17 +0200, Alexander Larsson wrote:
>>>>> >> On Tue, 2015-03-31 at 16:17 +0200, Alexander Larsson wrote:
>>>>> >> > On Tue, 2015-03-31 at 06:59 -0700, Andy Lutomirski wrote:
>>>>> >> > >
>>>>> >> > > I don't think that this is correct.  That user can already create a
>>>>> >> > > nested userns and map themselves as 0 inside it.  Then they can mount
>>>>> >> > > devpts.
>>>>> >> >
>>>>> >> > I don't mind if they create a container and control the isolated ttys in
>>>>> >> > that sub container in the VPS; that's fine.  I do mind if they get
>>>>> >> > access to the ttys in the VPS.
>>>>> >> >
>>>>> >> > If you can convince me (and the rest of Linux) that the tty subsystem
>>>>> >> > should be mountable by an unprivileged user generally, then what you
>>>>> >> > propose is OK.
>>>>> >>
>>>>> >> That is controlled by the general rights to mount stuff. I.e. unless you
>>>>> >> have CAP_SYS_ADMIN in the VPS container you will not be able to mount
>>>>> >> devpts there. You can only do it in a subcontainer where you got
>>>>> >> permissions to mount via using user namespaces.
>>>>> >
>>>>> > OK let me try again.  Fine, if you want to speak capabilities, you've
>>>>> > given a non-root user an unexpected capability (the capability of
>>>>> > creating a ptmx device).  But you haven't used a capability separation
>>>>> > to do this, you've just hard coded it via a mount parameter mechanism.
>>>>> >
>>>>> > If you want to do this thing, do it properly, so it's acceptable to the
>>>>> > whole of Linux, not a special corner case for one particular type of
>>>>> > container.
>>>>> >
>>>>> > Security breaches are created when people code in special, little used,
>>>>> > corner cases because they don't get as thoroughly tested and inspected
>>>>> > as generally applicable mechanisms.
>>>>> >
>>>>> > What you want is to be able to use the tty subsystem as a non root user:
>>>>> > fine, but set that up globally, don't hide it in containers so a lot
>>>>> > fewer people care.
>>>>> I tend to agree, and not just for the tty subsystem.  This is an
>>>>> attack surface issue.  With unprivileged user namespaces, unprivileged
>>>>> users can create mount namespaces (probably a good thing for bind
>>>>> mounts, etc), network namespaces (reasonably safe by themselves),
>>>>> network interfaces and iptables rules (scary), fresh
>>>>> instances/superblocks of some filesystems (scariness depends on the fs
>>>>> -- tmpfs is probably fine), and more.
>>>>> I think we should have real controls for this, and this is mostly
>>>>> Eric's domain.  Eric?  A silly issue that sometimes prevents devpts
>>>>> from being mountable isn't a real control, though.
>> I thought the controls for limiting how much of the userspace API
>> an application could use were called seccomp and seccomp2.
>> Do we need something like a PAM module so that we can set up these
>> controls during login?
>>>> I'm honestly surprised that non-root is allowed to mount things in
>>>> general with user namespaces. This was long disabled for non-root use in
>>>> Fedora, but it is now enabled.
>>>> For instance, using loopback mounted files you could probably attack
>>>> some of the less well tested filesystem implementations by feeding them
>>>> fuzzed data.
>>> You actually can't do that right now.  Filesystems have to opt in to
>>> being mounted in unprivileged user namespaces, and no filesystems with
>>> backing stores have opted in.  devpts has, but it's buggy without this
>>> patch IMO.
>> Arguably you should use two user namespaces: the first to do what you
>> want to do as root, the second to run as the uid you want to run as.
>>>> Anyway, I don't see how this affects devpts though. If you're running in
>>>> a container (or uncontained), as a regular user with no mount
>>>> capabilities you can already mount a devpts filesystem if you create a
>>>> subcontainer with user namespaces and map your uid to 0 in the
>>>> subcontainer. Then you get a new ptmx device that you can do whatever
>>>> you want with. The mount option would let you do the same, except be
>>>> your regular uid in the subcontainer.
>>>> The only difference outside of the subcontainer arises if the outer
>>>> container has no uid 0 mapped, yet the user has CAP_SYS_ADMIN rights in
>>>> that container. Then he can mount devpts in the outer container, where
>>>> before he could only mount it in an inner container.
>>> Agreed.  Also, devpts doesn't seem scary at all to me from a userns
>>> perspective.  Regular users on normal systems can already use ptmx,
>>> and AFAICS basically all of the attack surface is already available
>>> through the normal /dev/ptmx node.
>> My only real take is that there are a lot more places that you need to
>> tweak beyond devpts.  So this patch seemed lacking and boring.
>> Beyond that until I get the mount namespace sorted out things are pretty
>> much in a feature freeze because I can't multitask well enough to do
>> complicated patches and take feature patches.
> Eric, do you think you have time now to take a look at this patch?

I am much closer.  Escaping bind mounts is not yet fixed, but I
have code that almost works.

My gut feel still says that two user namespaces, one where uid 0 is
mapped to your uid and a second where your uid is identity mapped, is the
preferable configuration, and makes this patch unnecessary.
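
As a rough illustration of the two-namespace arrangement described above,
the sketch below builds the uid_map lines such a setup would use and checks
that they compose to an identity mapping.  It is not part of the patch.  The
helper names (outer_uid_map, inner_uid_map, translate, resolve_through_both)
are hypothetical; only the "inside-id outside-id length" uid_map line format
comes from the kernel.  Nothing is written to /proc here.

```python
# Sketch only: models the uid_map contents for a pair of nested user
# namespaces. All function names are hypothetical helpers for illustration.

def outer_uid_map(uid: int) -> str:
    """First namespace: uid 0 inside maps to the caller's real uid."""
    return f"0 {uid} 1"


def inner_uid_map(uid: int) -> str:
    """Second namespace, relative to the first: uid inside maps to 0 there."""
    return f"{uid} 0 1"


def translate(uid_map_line: str, inside_id: int):
    """Translate an ID inside a namespace to its parent's ID.

    Returns None when the ID falls outside the mapped range.
    """
    first_inside, first_outside, length = (int(x) for x in uid_map_line.split())
    if first_inside <= inside_id < first_inside + length:
        return first_outside + (inside_id - first_inside)
    return None


def resolve_through_both(uid: int, inside_id: int):
    """Follow an ID from the inner namespace out to the initial one."""
    parent_id = translate(inner_uid_map(uid), inside_id)
    if parent_id is None:
        return None
    return translate(outer_uid_map(uid), parent_id)


if __name__ == "__main__":
    uid = 1000  # assumed example uid
    print("outer uid_map:", outer_uid_map(uid))
    print("inner uid_map:", inner_uid_map(uid))
    print("inner uid", uid, "resolves to", resolve_through_both(uid, uid))
```

In this model devpts would be mounted while running as uid 0 in the first
namespace, and the workload would then run under its usual uid in the second.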

I don't think I have heard anyone describe why using a pair of user
namespaces is a problem.

Conceptually, as the patch is an efficiency hack on something we can
already do, I don't have any semantic grounds to refuse it.  There remain
maintenance concerns (how much else will need this kind of hack), code
complexity concerns, and concerns about whether the patch is buggy.

