[PATCH RFC] user-namespaced file capabilities - now with more magic

Mimi Zohar zohar at linux.vnet.ibm.com
Thu May 19 20:53:56 UTC 2016


On Wed, 2016-05-18 at 16:57 -0500, Serge E. Hallyn wrote:
> This patch introduces a new security.nscapability xattr.  It
> is mostly like security.capability, but also lists a 'rootid'.
> This is the uid_t (in init_user_ns) of the root id (uid 0 in a
> namespace) in whose namespaces the file capabilities may take
> effect.
> 
> A privileged (cap_setfcap) process in the initial user ns may
> set and read this xattr directly.  However, its real intent is
> to be used as a transparent fallback in user namespaces.
> 
> Root in a user ns cannot be trusted to write security.capability
> xattrs, because any user on the host could map his own uid to root
> in a namespace, write the xattr, and execute the file with privilege
> on the host.
> 
> With this patch, when root in a user ns asks to write security.capability,
> the kernel will transparently write a security.nscapability xattr
> instead, filling in the kuid of the calling user's root uid.  Subsequently,
> any task executing the file which has the noted k_uid as its root uid,
> or which is in a descendent user_ns of such a user_ns, will run the
> file with capabilities.
> 
> When reading the security.capability xattr from a non-init user_ns, a valid
> security.nscapability will be shown if it exists.  Such a task is not
> allowed to read security.nscapability.  This could be accommodated, however

Please add the word "directly" here, i.e. "not allowed to read
security.nscapability directly".

> it requires the kernel to convert the kuid_t to a valid uid in the reader's
> user_ns.  So for now it's simply not supported.

I really like the idea that the kernel transparently substitutes
security.nscapability for security.capability.
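
For illustration only, the exec-time rule described above (the file
capabilities apply if the recorded rootid is the root kuid of the task's
user_ns or of one of its ancestors) can be modelled in userspace roughly
as follows.  This is a sketch, not code from the patch: struct
user_ns_model, root_kuid and nscap_applies() are hypothetical names, and
stopping the walk before the initial namespace is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kuid_t (a uid as seen in init_user_ns). */
typedef unsigned int kuid_model_t;

/* Hypothetical model of a user namespace: only the fields this check needs. */
struct user_ns_model {
	const struct user_ns_model *parent;	/* NULL for the initial namespace */
	kuid_model_t root_kuid;			/* kuid that uid 0 in this ns maps to */
};

/*
 * Return true if a security.nscapability xattr recording 'rootid' would
 * take effect for a task running in 'task_ns'.  The walk covers the task's
 * own namespace and its ancestors, but (by assumption) never the initial
 * namespace, so privilege cannot leak up to the host.
 */
static bool nscap_applies(const struct user_ns_model *task_ns,
			  kuid_model_t rootid)
{
	const struct user_ns_model *ns;

	assert(task_ns != NULL);
	for (ns = task_ns; ns->parent != NULL; ns = ns->parent) {
		if (ns->root_kuid == rootid)
			return true;
	}
	return false;
}
```

With a container whose root maps to kuid 100000, a task in that container
or in a namespace nested below it would match rootid 100000, while a task
in the initial namespace would not.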

> Only a single security.nscapability xattr may be written.  This patch
> could be expanded to allow a list of capabilities and rootids, however
> I do not believe that to be a worthwhile use case.

Ok

> This allows a simple setxattr to work, allows tar/untar to
> work, and allows us to tar in one namespace and untar in
> another while preserving the capability, without risking
> leaking privilege into a parent namespace.
> 
> Note - listxattr is not being handled here.  So results of that can be
> inconsistent with get/setxattr.  Fixing that will require yet more
> deceit in fs/xattr.c.
> 
> Note2 - it may be less sneaky to hide all the magic behind the
> security.nscapability xattr.  So userspace would need to know to
> use that xattr name when needed, but with the same format as
> security.capability.  The kuid_t rootid would be filled in by the
> kernel.  That's a middle ground between my last patch and this one.

The less userspace needs to differentiate between running in a namespace
and not, the better.

Note3 - security.capability is currently protected by EVM, when enabled.
Should security.nscapability also be a protected xattr?

> Signed-off-by: Serge Hallyn <serge.hallyn at ubuntu.com>
> ---
>  fs/xattr.c                      |  18 ++-
>  include/linux/capability.h      |   8 +-
>  include/uapi/linux/capability.h |  19 +++
>  include/uapi/linux/xattr.h      |   3 +
>  security/commoncap.c            | 253 ++++++++++++++++++++++++++++++++++++++--
>  5 files changed, 291 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/xattr.c b/fs/xattr.c
> index 4861322..5c0e7ae 100644
> --- a/fs/xattr.c
> +++ b/fs/xattr.c
> @@ -94,11 +94,26 @@ int __vfs_setxattr_noperm(struct dentry *dentry, const char *name,
>  {
>  	struct inode *inode = dentry->d_inode;
>  	int error = -EOPNOTSUPP;
> +	void *wvalue = NULL;
> +	size_t wsize = 0;
>  	int issec = !strncmp(name, XATTR_SECURITY_PREFIX,
>  				   XATTR_SECURITY_PREFIX_LEN);
> 
> -	if (issec)
> +	if (issec) {
>  		inode->i_flags &= ~S_NOSEC;
> +		/* if root in a non-init user_ns tries to set
> +		 * security.capability, write a security.nscapability
> +		 * in its place */
> +		if (!strcmp(name, "security.capability") &&
> +				current_user_ns() != &init_user_ns) {
> +			cap_setxattr_make_nscap(dentry, value, size, &wvalue, &wsize);
> +			if (!wvalue)
> +				return -EPERM;
> +			value = wvalue;
> +			size = wsize;
> +			name = "security.nscapability";
> +		}

The call to capable_wrt_inode_uidgid() is hidden behind
cap_setxattr_make_nscap().  Does it make sense to call it here instead,
before the security.capability test?  This would lay the foundation for
doing something similar for IMA.
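
To make the suggested ordering concrete, here is a small userspace model
of it.  It is only a sketch under stated assumptions:
pick_security_xattr_name() is a hypothetical helper, the two booleans
stand in for the current_user_ns() != &init_user_ns test and for the
result of capable_wrt_inode_uidgid(), and leaving callers in the initial
namespace untouched is an assumption about how the check would be scoped.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical model: decide which xattr name a security.* write should
 * really use.  The permission check comes first, before any test on the
 * specific xattr name, so that other xattrs (e.g. an IMA one) could later
 * be given their own namespaced substitute in the same place.
 *
 * Returns 0 and stores the name to write in *out, or -EPERM.
 */
static int pick_security_xattr_name(const char *name, bool in_init_ns,
				    bool capable_wrt_inode, const char **out)
{
	*out = name;

	/* Assumption: writes from the initial namespace are not rewritten. */
	if (in_init_ns)
		return 0;

	/* Permission check first, independent of which xattr is written. */
	if (!capable_wrt_inode)
		return -EPERM;

	/* Only then substitute the namespaced name. */
	if (strcmp(name, "security.capability") == 0)
		*out = "security.nscapability";

	return 0;
}
```

In this model the name substitution stays a pure lookup, and the EPERM
decision no longer depends on what the capability helper does internally.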

(Will continue reviewing ...)

Mimi


