Interaction user namespace, /proc/1 ownership & cap_set
richard at nod.at
Tue Jul 2 08:56:37 UTC 2013
On 02.07.2013 10:44, Eric W. Biederman wrote:
> Gao feng <gaofeng at cn.fujitsu.com> writes:
>> On 07/02/2013 12:16 AM, Daniel P. Berrange wrote:
>>> I'm struggling debugging a strange problem with interaction between user
>>> namespaces, cap_set and ownership of files in /proc/1/
>> This problem occurs after we call setuid/setgid.
>> for example, a task whose pid is 1234 calls
>> The uid/gid of the /proc/1234 is 10:0
>> ll /proc/1234 -d
>> dr-xr-xr-x 8 uucp wheel 0 Jul 2 10:57 /proc/1234
>> the files under /proc/1234 have two different kinds of uid/gid...
>> ll /proc/1234
>> dr-xr-xr-x 2 uucp wheel 0 Jul 2 10:58 attr
>> -rw-r--r-- 1 root root 0 Jul 2 10:58 autogroup
>> dr-xr-xr-x 5 uucp wheel 0 Jul 2 10:58 net
>> dr-x--x--x 2 root root 0 Jul 2 10:58 ns
>> dr-xr-xr-x 3 uucp wheel 0 Jul 2 10:58 task
>> I checked pid_revalidate and found that the owner of the files under /proc/<pid>
>> will be set to GLOBAL_ROOT_UID if the task has executed setuid/setgid (task_dumpable is false).
>> Is this what we expect? Why?
> Expected yes. Perfect perhaps not.
> That piece of code has not been examined to see if it is safe to use
> make_kuid(task_user_ns(task), 0), instead of GLOBAL_ROOT_UID.
>> For user namespaces, the owner of /proc/1/* is incorrect, and
>> after a task calls setuid/setgid in a user namespace, the owner of /proc/<pid-of-this-task>/* is incorrect.
> Given the current semantics of dumpable, GLOBAL_ROOT_UID is correct.
> Please double check but I believe /proc/self should continue to work,
> despite this.
/proc/self is not an option. systemd (in particular some of its tools with pid != 1) reads /proc/1/environ to find out
which environment variables PID 1 was started with, in order to detect LXC and other virtualization environments.
With userns enabled this check fails and systemd goes nuts because it thinks it is running on a "real" Linux system.
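
To illustrate the kind of check I mean, here is a rough sketch in Python (not systemd's actual code). It reads /proc/<pid>/environ, which is a block of NUL-separated KEY=VALUE entries, and looks up one variable. The variable name "container" and the helper names parse_environ and detect_container are assumptions made for this example. If the file is owned by an uid the caller cannot act as, the open simply fails and the caller learns nothing about its environment.

```python
import os


def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE block found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        # Skip the empty trailing entry and anything that is not KEY=VALUE.
        if not entry or b"=" not in entry:
            continue
        key, _, value = entry.partition(b"=")
        env[key.decode("utf-8", "replace")] = value.decode("utf-8", "replace")
    return env


def detect_container(pid: int = 1, var: str = "container"):
    """Return (value, error) for `var` in the environment of `pid`.

    `var` defaults to "container", which is an assumption for this sketch.
    value is None if the variable is unset or the file is unreadable;
    error is a message only in the unreadable case.
    """
    path = f"/proc/{pid}/environ"
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError as exc:
        # This is the failure case discussed in this thread: the file exists
        # but its ownership does not let the caller read it.
        return None, f"cannot read {path}: {exc.strerror}"
    return parse_environ(raw).get(var), None


if __name__ == "__main__":
    value, error = detect_container()
    if error:
        print(error)
    else:
        print("container variable:", value)
```

Run as a non-pid-1 process inside the container, the interesting outcome is the error branch: the tool cannot tell "variable not set" apart from "not allowed to look".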