[PATCH 1/2] proc.5: Document /proc/[pid]/setgroups

Michael Kerrisk (man-pages) mtk.manpages at gmail.com
Thu Feb 12 13:53:29 UTC 2015


Hello Eric,

On 02/11/2015 02:51 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages at gmail.com> writes:
> 
>> Hi Eric,
>>
>> Ping!
>>
>> Cheers,
>>
>> Michael
> 
> My apologies.  Your description wasn't wrong but it may be a bit
> misleading, explanation below.  You will have to figure out how to work
> that into your proposed text.
> 
>> On 2 February 2015 at 16:36, Michael Kerrisk (man-pages)
>> <mtk.manpages at gmail.com> wrote:
>>> [Adding Josh to CC in case he has anything to add.]
>>>
>>> On 12/12/2014 10:54 PM, Eric W. Biederman wrote:
>>>>
>>>> Signed-off-by: Eric W. Biederman <ebiederm at xmission.com>
>>>> ---
>>>>  man5/proc.5 | 15 +++++++++++++++
>>>>  1 file changed, 15 insertions(+)
>>>>
>>>> diff --git a/man5/proc.5 b/man5/proc.5
>>>> index 96077d0dd195..d661e8cfeac9 100644
>>>> --- a/man5/proc.5
>>>> +++ b/man5/proc.5
>>>> @@ -1097,6 +1097,21 @@ are not available if the main thread has already terminated
>>>>  .\"       Added in 2.6.9
>>>>  .\"       CONFIG_SCHEDSTATS
>>>>  .TP
>>>> +.IR /proc/[pid]/setgroups " (since Linux 3.19-rc1)"
>>>> +This file reports
>>>> +.BR allow
>>>> +if the setgroups system call is allowed in the current user namespace.
>>>> +This file reports
>>>> +.BR deny
>>>> +if the setgroups system call is not allowed in the current user namespace.
>>>> +This file may be written to with values of
>>>> +.BR allow
>>>> +and
>>>> +.BR deny
>>>> +before
>>>> +.IR /proc/[pid]/gid_map
>>>> +is written to (enabling setgroups) in a user namespace.
>>>> +.TP
>>>>  .IR /proc/[pid]/smaps " (since Linux 2.6.14)"
>>>>  This file shows memory consumption for each of the process's mappings.
>>>>  (The
>>>
>>> Hi Eric,
>>>
>>> Thanks for this patch. I applied it, and then tried to work in
>>> quite a few other details gleaned from the source code and commit
>>> message, and Jon Corbet's article at http://lwn.net/Articles/626665/.
>>> Could you please let me know if the following is correct:
> 
> It is close but it may be misleading.
> 
>>>     /proc/[pid]/setgroups (since Linux 3.19)
>>>            This file displays the string "allow"  if  processes  in
>>>            the  user  namespace  that  contains the process pid are
>>>            permitted to employ the setgroups(2)  system  call,  and
>>>            "deny"  if  setgroups(2)  is  not permitted in that user
>>>            namespace.
> 
> With the caveat that when gid_map is not set, setgroups is also not
> allowed.

Okay -- I added that point.
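
To make that caveat concrete, here is a minimal Python sketch (not part
of the patch) of the two-part check it implies: setgroups(2) is usable
only when the setgroups file reads "allow" and gid_map has been set.
The function name and the proc_root parameter are hypothetical, added
so the logic can be tried against a scratch directory. The sketch
assumes that an unset gid_map reads back as an empty file, and it does
not check the caller's capabilities.

```python
import os


def setgroups_effectively_allowed(pid="self", proc_root="/proc"):
    """Return True if setgroups(2) should be usable in pid's user namespace.

    Both conditions from the discussion must hold:
      1. /proc/[pid]/setgroups reads "allow", and
      2. /proc/[pid]/gid_map has been set (assumed here: it is non-empty).
    Capability checks on the calling process are deliberately left out.
    """
    base = os.path.join(proc_root, str(pid))
    with open(os.path.join(base, "setgroups")) as f:
        setting = f.read().strip()
    with open(os.path.join(base, "gid_map")) as f:
        gid_map_set = f.read().strip() != ""
    return setting == "allow" and gid_map_set
```

On a kernel that provides the file, calling
setgroups_effectively_allowed() with no arguments inspects the calling
process's own user namespace.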

>>>            A privileged process (one with the  CAP_SYS_ADMIN  capa‐
>>>            bility in the namespace) may write either of the strings
>>>            "allow" or "deny" to this file before writing a group ID
>>>            mapping   for   this   user   namespace   to   the  file
>>>            /proc/[pid]/gid_map.  Writing the string "deny" prevents
>>>            any  process  in  the user namespace from employing set‐
>>>            groups(2).
> 
> Or more succinctly.  You are allowed to write to /proc/[pid]/setgroups
> when calling setgroups is not allowed because gid_map is unset.  This
> ensures we do not have any transitions from a state where setgroups
> is allowed to a state where setgroups is denied.  There are only
> transitions from setgroups not-allowed to setgroups allowed.

And I've worked in the above point, rewording a bit along the way.
So, how does the following look (only the first two paragraphs have
changed)?

       /proc/[pid]/setgroups (since Linux 3.19)
              This file displays the string "allow"  if  processes  in
              the  user  namespace  that  contains the process pid are
              permitted to employ the setgroups(2)  system  call,  and
              "deny"  if  setgroups(2)  is  not permitted in that user
              namespace.  (Note, however, that calls  to  setgroups(2)
              are  also  not  permitted if /proc/[pid]/gid_map has not
              yet been set.)

              A privileged process (one with the  CAP_SYS_ADMIN  capa‐
              bility in the namespace) may write either of the strings
              "allow" or "deny" to this file before writing a group ID
              mapping   for   this   user   namespace   to   the  file
              /proc/[pid]/gid_map.  Writing the string "deny" prevents
              any  process  in  the user namespace from employing set‐
              groups(2).  In other words, it is permitted to write  to
              /proc/[pid]/setgroups so long as calling setgroups(2) is
              not allowed because /proc/[pid]/gid_map has not been set.
              This  ensures  that  a  process cannot transition from a
              state where setgroups(2) is allowed  to  a  state  where
              setgroups(2)  is  denied;  a process can only transition
              from setgroups(2) being disallowed to setgroups(2) being
              allowed.

              The  default  value  of  this  file  in the initial user
              namespace is "allow".

              Once /proc/[pid]/gid_map has been written to (which  has
              the  effect  of enabling setgroups(2) in the user names‐
              pace), it is no longer possible to deny setgroups(2)  by
              writing to /proc/[pid]/setgroups.

              A  child user namespace inherits the /proc/[pid]/setgroups
              setting from its parent.

              If the setgroups file has the  value  "deny",  then  the
              setgroups(2) system call can't subsequently be reenabled
              (by writing "allow" to the file) in this user namespace.
              This  restriction also propagates down to all child user
              namespaces of this user namespace.
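
As a cross-check on the wording, the rules above can be written down as
a toy state model in Python. This is only a sketch of the text, not
kernel code: the class and method names are hypothetical, the
CAP_SYS_ADMIN check is omitted, and treating a redundant write of
"allow" as a harmless no-op is an assumption of the model.

```python
class UserNamespaceModel:
    """Toy model of the /proc/[pid]/setgroups rules (not kernel code)."""

    def __init__(self, parent=None):
        # The initial namespace defaults to "allow"; a child namespace
        # inherits its parent's setting.
        self.setgroups = parent.setgroups if parent else "allow"
        # Assumption: the initial namespace already has a GID mapping,
        # while a new child namespace starts with gid_map unset.
        self.gid_map_set = parent is None

    def write_setgroups(self, value):
        """Model a write of "allow" or "deny" to the setgroups file."""
        if value not in ("allow", "deny"):
            raise ValueError("expected 'allow' or 'deny'")
        if value == "allow" and self.setgroups == "deny":
            # Once denied, setgroups(2) cannot be re-enabled.
            raise PermissionError("cannot re-enable after 'deny'")
        if value == "deny" and self.gid_map_set:
            # No transition from allowed to denied once gid_map is set.
            raise PermissionError("cannot deny after gid_map is written")
        self.setgroups = value

    def write_gid_map(self):
        """Model writing a group ID mapping for this namespace."""
        self.gid_map_set = True

    def setgroups_permitted(self):
        """setgroups(2) needs both the "allow" setting and a gid_map."""
        return self.setgroups == "allow" and self.gid_map_set
```

In this model a namespace only ever moves from "setgroups(2) not
permitted" to "permitted" (by writing gid_map while the setting is
"allow"), never the other way, and a "deny" carries over to every
namespace created beneath it.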

Cheers,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

