For review (v2): user_namespaces(7) man page
luto at amacapital.net
Mon Apr 29 20:21:18 UTC 2013
On Thu, Apr 25, 2013 at 10:48 PM, richard -rw- weinberger
<richard.weinberger at gmail.com> wrote:
> On Fri, Apr 26, 2013 at 2:54 AM, Eric W. Biederman
> <ebiederm at xmission.com> wrote:
>> richard -rw- weinberger <richard.weinberger at gmail.com> writes:
>>> On Wed, Mar 27, 2013 at 10:26 PM, Michael Kerrisk (man-pages)
>>> <mtk.manpages at gmail.com> wrote:
>>>> Inside the user namespace, the shell has user and group ID 0,
>>>> and a full set of permitted and effective capabilities:
>>>> bash$ cat /proc/$$/status | egrep '^[UG]id'
>>>> Uid: 0 0 0 0
>>>> Gid: 0 0 0 0
>>>> bash$ cat /proc/$$/status | egrep '^Cap(Prm|Inh|Eff)'
>>>> CapInh: 0000000000000000
>>>> CapPrm: 0000001fffffffff
>>>> CapEff: 0000001fffffffff
>>> I've tried your demo program, but inside the new ns I'm automatically nobody.
>>> As Eric said, setuid(0)/setgid(0) are missing.
>> Is it the setuid/setgid or not setting up the uid/gid map?
> uid/gid mappings are set up.
>>> Eric, maybe you can help me. How can I drop capabilities within a user
>>> namespace?
>>> In childFunc() I added prctl(PR_CAPBSET_DROP, CAP_NET_ADMIN) but it always
>>> fails with EPERM.
>>> Why is that? I thought I would get a completely fresh set of caps which I can modify.
>>> I don't want uid 0 inside the container to have all caps.
>> There are weird things that happen with exec and the user namespace. If
>> you have exec'd as an unmapped user all of your capabilities have
>> already been dropped.
> I've set up the mappings. If I look into /proc/*/status I see that my process has
> all caps.
> So, in general, is it possible to drop caps within a user namespace?
> I really want to drop CAP_NET_ADMIN and some others.
> root within my container must not change any networking settings.
You may have the common issue that uid 0 tends to regain capabilities
on exec due to "legacy" capability emulation. Try playing with
securebits and/or the bounding set. (The setpriv command in very new
util-linux makes this easy to play with.)
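
The bounding-set and securebits knobs could be sketched in C roughly as follows. This is a minimal, untested sketch, not a drop-in fix: the helper names drop_from_bounding_set() and disable_root_cap_emulation() are made up for illustration, and both calls fail with EPERM unless the caller has CAP_SETPCAP.

```c
#include <errno.h>
#include <sys/prctl.h>
#include <linux/capability.h>   /* CAP_NET_ADMIN and friends */
#include <linux/securebits.h>   /* SECBIT_NOROOT and friends */

/*
 * Hypothetical helper: remove one capability from the calling thread's
 * bounding set, so that a later exec cannot hand it back to uid 0.
 * Returns 0 on success, -1 with errno set (EPERM without CAP_SETPCAP).
 */
int drop_from_bounding_set(unsigned long cap)
{
    return prctl(PR_CAPBSET_DROP, cap, 0L, 0L, 0L);
}

/*
 * Hypothetical helper: switch off the "legacy" treatment of uid 0 and
 * lock the setting. With SECBIT_NOROOT, exec as uid 0 no longer grants
 * capabilities; with SECBIT_NO_SETUID_FIXUP, uid changes no longer
 * adjust the capability sets.
 * Returns 0 on success, -1 with errno set (EPERM without CAP_SETPCAP).
 */
int disable_root_cap_emulation(void)
{
    unsigned long bits = SECBIT_NOROOT | SECBIT_NOROOT_LOCKED |
                         SECBIT_NO_SETUID_FIXUP |
                         SECBIT_NO_SETUID_FIXUP_LOCKED;

    return prctl(PR_SET_SECUREBITS, bits, 0L, 0L, 0L);
}
```

Either helper would be called in the child before exec. If the goal is a uid 0 that keeps most capabilities but never has CAP_NET_ADMIN after exec, drop_from_bounding_set(CAP_NET_ADMIN) alone is probably the relevant call, since SECBIT_NOROOT stops uid 0 from being granted any capabilities on exec. The CapBnd line in /proc/self/status shows the resulting bounding set.
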
Note that you almost certainly want to set no_new_privs if anything
other than uid 0 is running with non-default securebits.
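
Setting no_new_privs is a single prctl() call and needs no privilege. A minimal sketch, assuming a kernel new enough to have PR_SET_NO_NEW_PRIVS (3.5 or later); the helper names are made up for illustration:

```c
#include <sys/prctl.h>

/*
 * Hypothetical helper: set no_new_privs for the calling thread.
 * Once set, the flag is inherited across fork and exec and cannot be
 * cleared; exec will not grant privileges through setuid/setgid bits
 * or file capabilities.
 * Returns 0 on success, -1 with errno set.
 */
int set_no_new_privs(void)
{
    return prctl(PR_SET_NO_NEW_PRIVS, 1L, 0L, 0L, 0L);
}

/*
 * Hypothetical helper: read the flag back.
 * Returns 1 if no_new_privs is set, 0 if not, -1 on error.
 */
int get_no_new_privs(void)
{
    return prctl(PR_GET_NO_NEW_PRIVS, 0L, 0L, 0L, 0L);
}
```

Calling set_no_new_privs() in the child before changing securebits or exec'ing anything means a non-root process with unusual securebits cannot pick up privileges from a setuid binary.
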