userns idea: preventing SCM_CREDENTIALS from leaking out

Andy Lutomirski luto at amacapital.net
Wed Nov 27 17:54:37 UTC 2013


On Wed, Nov 27, 2013 at 8:56 AM, Miklos Szeredi <miklos at szeredi.hu> wrote:
> On Wed, Nov 27, 2013 at 5:24 PM, Andy Lutomirski <luto at amacapital.net> wrote:
>>> Actually an option to aufs and overlayfs to say "any unix domain socket
>>> which is opened must first be copied to the writeable layer" would
>>> solve the issue (at least for all reasonable cases, iiuc)
>>
>> I guess I'm reasonably convinced that overlayfs is the right place to
>> fix this.  (Containers using lvm will be left in the cold -- oh,
>> well.)
>>
>> cc: Miklos, who is the most likely to implement one or both of these features.
>
> AFAICS implementing the option to copy up a unix domain socket on open
> is trivial:  just need to tweak ovl_open_need_copy_up().
>
> Is that what you were thinking?

I'm not familiar enough with overlayfs to say.  I think the desired
semantics would be that a socket in the overlay mount is a different
inode from the same socket in the lower (underlying) filesystem.

>
>> (In cases where containers share a (non-overlay) directory that one of
>> them can write, would it make sense to have an option MS_NOSOCKET that
>> works on bind mounts?)
>
> Isn't it "you can't send SCM_CREDENTIALS", rather than "you can't open
> unix domain socket"?
>

Blocking only the credential passing may be considerably harder to
implement.  (SO_PEERCRED would need to be covered too, and I don't
know if there's a good place to stick a flag for this in an open
socket.)
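
For reference, a minimal userspace sketch of the two ways credentials
reach a peer over a unix domain socket: SO_PEERCRED (recorded when the
socket pair is set up) and SCM_CREDENTIALS (attached per message and
delivered once the receiver sets SO_PASSCRED).  It is Linux-only, and
the helper names are made up for illustration.

```python
import os
import socket
import struct

# Layout of struct ucred on Linux: pid_t pid, uid_t uid, gid_t gid.
UCRED = struct.Struct("iII")


def peer_credentials(sock):
    """Return (pid, uid, gid) recorded for the peer via SO_PEERCRED."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, UCRED.size)
    return UCRED.unpack(raw)


def send_with_credentials(sock, payload):
    """Send payload with an explicit SCM_CREDENTIALS control message."""
    creds = UCRED.pack(os.getpid(), os.getuid(), os.getgid())
    sock.sendmsg([payload], [(socket.SOL_SOCKET, socket.SCM_CREDENTIALS, creds)])


def recv_with_credentials(sock, bufsize=64):
    """Receive data plus the sender's (pid, uid, gid), or None if absent.

    The receiving socket must have SO_PASSCRED enabled beforehand.
    """
    data, ancdata, _flags, _addr = sock.recvmsg(
        bufsize, socket.CMSG_SPACE(UCRED.size)
    )
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_CREDENTIALS:
            return data, UCRED.unpack(cdata[:UCRED.size])
    return data, None
```

To try it, create a pair with socket.socketpair(socket.AF_UNIX,
socket.SOCK_STREAM), set SO_PASSCRED on the receiving end, and call the
helpers on each side.  Both paths would have to be handled by any
"no credentials" restriction.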

I think the ideal solution here is to have non-overlapping uid ranges,
and an option in overlayfs to remap uids and gids would make this
possible, at least if overlayfs is in use.
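
As a rough illustration of what "non-overlapping uid ranges" means, the
sketch below takes two maps in the /proc/<pid>/uid_map text format
(inside-id, outside-id, length per line) and checks whether their
host-side ranges intersect.  The function names and the sample ranges
are hypothetical; nothing here touches overlayfs.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map text into (inside, outside, length) tuples."""
    return [
        tuple(int(field) for field in line.split())
        for line in text.splitlines()
        if line.strip()
    ]


def host_ranges(id_map):
    """Return half-open [start, end) ranges of ids outside the namespace."""
    return [(outside, outside + length) for _inside, outside, length in id_map]


def ranges_overlap(map_a, map_b):
    """True if any host-side range of map_a intersects one of map_b."""
    return any(
        a_start < b_end and b_start < a_end
        for a_start, a_end in host_ranges(map_a)
        for b_start, b_end in host_ranges(map_b)
    )
```

For example, maps "0 100000 65536" and "0 165536 65536" do not overlap,
while "0 100000 65536" and "0 150000 65536" do.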

--Andy

