[PATCH 1/4] proc.5: Document /proc/[pid]/uid_map and /proc/[pid]/gid_map

Michael Kerrisk (man-pages) mtk.manpages at gmail.com
Fri Dec 28 19:20:15 UTC 2012


Hi Eric,

On Thu, Dec 27, 2012 at 5:58 PM, Eric W. Biederman
<ebiederm at xmission.com> wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages at gmail.com> writes:
>
>> Hi Eric,
>>
>> Thanks for this patch. I have one question and a revised version of the
>> text that I'd like you to review.
>>
>> On Tue, Nov 27, 2012 at 1:46 AM, Eric W. Biederman
>> <ebiederm at xmission.com> wrote:
>>>
>>> Document the user namespace files that report the mapping of uids
>>> and gids between user namespaces.
>>>
>>> Signed-off-by: "Eric W. Biederman" <ebiederm at xmission.com>
>>> ---
>>>  man5/proc.5 |   50 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  1 files changed, 50 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/man5/proc.5 b/man5/proc.5
>>> index fb70d2b..840480d 100644
>>> --- a/man5/proc.5
>>> +++ b/man5/proc.5
>>> @@ -317,6 +317,31 @@ The files in this directory are readable only by the owner of the process.
>>>  .\" .TP
>>>  .\" .IR /proc/[pid]/io " (since kernel 2.6.20)"
>>>  .TP
>>> +.IR /proc/[pid]/gid_map " (since kernel 3.6)"
>>> +This file reports the mapping of gids from the user namespace of the process specified by
>>> +.IR pid
>>> +to the user namespace of the process that opened
>>> +.IR /proc/[pid]/gid_map .
>>> +
>>> +Each line specifies a 1 to 1 mapping of a range of contiguous gids from
>>> +the user namespace of the process specified by
>>> +.IR pid
>>> +to the user namespace of the process that opened
>>> +.IR /proc/[pid]/gid_map .
>>
>> I want to check the above point. What do you mean by "the process that
>> opened uid_map"? Does that mean the process that opened uid_map to do
>> the one-time write of the UID map? I had assumed that uid_map actually
>> provided a mapping between the namespace of 'pid' and the 'parent'
>> namespace, where the parent namespace is the namespace of the process
>> that created this namespace via clone(CLONE_NEWUSER).
>
> I mean the process that opens uid_map for read or write.

Thanks for the confirmation.

> For writing you are correct about the mapping to the parent (but that is
> not an exception; it is a restriction on who can write to the file).

So, by the way, I added this sentence to the page:

              In   order   to   write   to   the   /proc/[pid]/uid_map
              (/proc/[pid]/gid_map) file,  a  process  must  have  the
              CAP_SETUID (CAP_SETGID) capability in the user namespace
              of the process pid.

Is that correct?

But, there appear to be more rules than this governing whether a
process can write to the file (i.e., various other -EPERM cases). What
are the rules?

> The complete rule for the user namespace of the second value is:
>
> - If the user namespace of the opener of the file and the user namespace
>   of the process do not match, the user namespace of the opener of the
>   file is used.
>
> - If the user namespace of the opener of the file and the user namespace
>   of the process are the same, the parent user namespace of the process
>   is used for the second value.

Could you give an example of the last case? (What I'm really seeking,
I think, is clarification of "parent user namespace". Does that mean
"user namespace of the process that created the user namespace of this
process"?)
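
To check my reading of the rule, here is how I would sketch it in
Python. The function name, the string labels for namespaces, and the
parent_of table are all made up for illustration; none of this is a
kernel interface. It also assumes that "parent" means the namespace
from which the new namespace was created, which is exactly the point I
am asking about.

```python
def second_field_namespace(opener_ns, target_ns, parent_of):
    """Return the user namespace that field 2 of uid_map is expressed in.

    opener_ns: label of the namespace of the process that opened uid_map
    target_ns: label of the namespace of the process 'pid'
    parent_of: dict mapping a namespace label to its parent's label
               (hypothetical bookkeeping, for illustration only)
    """
    if opener_ns != target_ns:
        # Namespaces differ: the opener's namespace is used.
        return opener_ns
    # Same namespace: fall back to the parent of the process's namespace.
    return parent_of[target_ns]


parents = {"child": "init", "other": "init"}
print(second_field_namespace("other", "child", parents))
print(second_field_namespace("child", "child", parents))
```

If that reading is right, a process inspecting its own uid_map sees the
mapping into the parent namespace, while a process in some unrelated
namespace sees the mapping into its own.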


> While very wordy, I think the rule makes a lot of intuitive and practical
> sense.  Especially since it is non-trivial to come up with the chain of
> user namespaces a process is in.
>
>>> +Each line contains three numbers.  The start of the range of gids in
>>> +the user namespace of the process specified by
>>> +.IR pid .
>>> +The start of the range of gids in the user namespace of the process that
>>> +opened
>>> +.IR /proc/[pid]/gid_map .
>>> +The number of gids in the range of numbers that is mapped between the two
>>> +user namespaces.
>>> +
>>> +After the creation of a new user namespace this file may be written to
>>> +exactly once to specify the mapping of gids in the new user namespace.
>>> +
>>> +.TP
>>>  .IR /proc/[pid]/limits " (since kernel 2.6.24)"
>>>  This file displays the soft limit, hard limit, and units of measurement
>>>  for each of the process's resource limits (see
>>> @@ -1169,6 +1194,31 @@ directory are not available if the main thread has already terminated
>>>  (typically by calling
>>>  .BR pthread_exit (3)).
>>>  .TP
>>> +.IR /proc/[pid]/uid_map " (since kernel 3.6)"
>>> +This file reports the mapping of uids from the user namespace of the process specified by
>>> +.IR pid
>>> +to the user namespace of the process that opened
>>> +.IR /proc/[pid]/uid_map .
>>> +
>>> +Each line specifies a 1 to 1 mapping of a range of contiguous uids from
>>> +the user namespace of the process specified by
>>> +.IR pid
>>> +to the user namespace of the process that opened
>>> +.IR /proc/[pid]/uid_map .
>>> +
>>> +Each line contains three numbers.  The start of the range of uids in
>>> +the user namespace of the process specified by
>>> +.IR pid .
>>> +The start of the range of uids in the user namespace of the process that
>>> +opened
>>> +.IR /proc/[pid]/uid_map .
>>> +The number of uids in the range of numbers that is mapped between the two
>>> +user namespaces.
>>> +
>>> +After the creation of a new user namespace this file may be written to
>>> +exactly once to specify the mapping of uids in the new user namespace.
>>> +
>>> +.TP
>>>  .I /proc/apm
>>>  Advanced power management version and battery information when
>>>  .B CONFIG_APM
>>
>> I revised your text quite a bit, and added a piece on the format of
>> the uid_map files. Could you please read the following and let me know
>> of errors:
>>
>> [[
>>        /proc/[pid]/uid_map, /proc/[pid]/gid_map (since Linux 3.6)
>>               These  files  expose the mappings for user and group IDs
>>               inside the user namespace  for  the  process  pid.   The
>>               description  here  explains  the  details  for  uid_map;
>>               gid_map is exactly the same, but each instance of  "user
>>               ID" is replaced by "group ID".
>>
>>               The  uid_map  file  exposes the mapping of user IDs from
>>               the user namespace of the process pid to the user names‐
>>               pace of the process that opened uid_map.
>>
>>               Each  line  in  the file specifies a 1-to-1 mapping of a
>>               range of contiguous user IDs from the user namespace  of
>>               the  process  pid  to  the user namespace of the process
>>               that opened uid_map.
>>
>>               Each line contains  three  numbers  delimited  by  white
>>               space:
>>
>>               (1) The  start  of  the  range  of  user IDs in the user
>>                   namespace of the process pid.
>>
>>               (2) The start of the range  of  user  IDs  in  the  user
>>                   namespace of the process that opened uid_map.
>>
>>               (3) The  length  of the range of user IDs that is mapped
>>                   between the two user namespaces.
>>
>>               After the creation of a new user  namespace,  this  file
>>               may be written to exactly once to specify the mapping of
>>               user IDs in the new  user  namespace.   (An  attempt  to
>>               write  more  than  once to the file fails with the error
>>               EPERM.)
>>
>>               The lines written to uid_map must conform to the follow‐
>>               ing rules:
>>
>>               *  The  three fields must be valid numbers, and the last
>>                  field must be greater than 0.
>>
>>               *  Lines are terminated by newline characters.
>>
>>               *  The file can contain a maximum of five lines.
>
> A maximum of 5 lines is important to document, but it is currently an
> arbitrary limit that may be changed in the future.  Right now 5 extents
> are more than enough for any conceivable use case, and fit nicely within
> a single cache line.
>
> It is probably better to say writes that exceed an arbitrary maximum
> length fail with -EINVAL.  Currently the arbitrary maximum length is
> five lines.

Okay -- reworded.
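
For reference, the per-line rules as I now understand them can be
sketched in Python as below. This is only a user-space illustration:
parse_map and MAX_LINES are names I made up, the value 5 is the current
arbitrary limit discussed above, and the sketch does not claim to match
the kernel's parser or its error codes beyond the EINVAL case Eric
describes.

```python
MAX_LINES = 5  # current limit per this thread; arbitrary and may change


def parse_map(text):
    """Parse uid_map-style text into (first, second, length) tuples.

    Raises ValueError for input that breaks the rules discussed above.
    """
    lines = text.splitlines()
    if len(lines) > MAX_LINES:
        # The kernel is said to fail such writes with EINVAL.
        raise ValueError("too many lines")
    extents = []
    for line in lines:
        fields = line.split()  # fields are delimited by white space
        if len(fields) != 3 or not all(f.isdigit() for f in fields):
            raise ValueError("expected three numbers: %r" % line)
        first, second, length = (int(f) for f in fields)
        if length <= 0:
            raise ValueError("length must be greater than 0: %r" % line)
        extents.append((first, second, length))
    return extents


print(parse_map("0 1000 1\n1 100000 65535\n"))
```

The example input maps ID 0 inside the namespace to 1000 outside, and
the next 65535 IDs starting at 1 to a range starting at 100000.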

>
>>               *  The values in both field 1 and field 2 of  each  line
>>                  must be in ascending numerical order.
>
> The rule is that the extents need to be non-overlapping.  Ascending
> numerical order is how that is implemented but that is a misfeature,
> and there has already been one request to fix that.  Removing the
> ascending numerical order limitation is on my todo list.

Okay -- I've reworded some text here.
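
To make the non-overlapping rule concrete, here is a small Python
sketch, again with made-up names (ranges_overlap, map_id) and operating
on plain (first, second, length) tuples rather than on the real file.
Returning None for an unmapped ID is my own choice for the sketch, not
a statement about what the kernel reports.

```python
def ranges_overlap(extents):
    """Return True if any two extents overlap in field 1 or in field 2."""
    for column in (0, 1):
        # Half-open [start, end) spans for this column, sorted by start.
        spans = sorted((e[column], e[column] + e[2]) for e in extents)
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            if next_start < prev_end:
                return True
    return False


def map_id(extents, inside_id):
    """Translate an ID from field 1's namespace to field 2's namespace."""
    for first, second, length in extents:
        if first <= inside_id < first + length:
            return second + (inside_id - first)
    return None  # no extent covers this ID


extents = [(0, 1000, 1), (1, 100000, 65535)]
print(ranges_overlap(extents), map_id(extents, 5))
```

Note that the check sorts the ranges itself, so extents listed in
descending order but not overlapping pass it. That is the distinction
Eric draws between the actual rule and the current ascending-order
implementation.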

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface"; http://man7.org/tlpi/
