[PATCH] Use CAP_SYS_RESOURCE as magic for escaping user namespaces.

Janne Karhunen janne.karhunen at gmail.com
Tue May 7 08:01:29 UTC 2013


The kernel currently has more than 1000 capable() calls, and
only a handful have been converted to ns_capable(). Moreover,
it probably does not make sense to convert most of these calls
to be namespace aware, given the nature of the physical
resources they control, which makes capable() the right
question to ask. Yet, in order to build containers that behave
like a 'fully functional real device', user namespaces
sometimes need access to real system resources.

Thus, one potential way to enable access to physical resources
from a user namespace is to use the namespace's own
CAP_SYS_RESOURCE as a magic token that makes the task's
capabilities valid in the initial user namespace
(init_user_ns).

Signed-off-by: Janne Karhunen <Janne.Karhunen at gmail.com>
---
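To illustrate the intended semantics, here is a rough user-space
model of the two central hunks. It is only a sketch and not kernel
code: the names user_ns_model, cred_model, init_ns_model,
cap_raised_model(), enter_user_ns_model() and cap_capable_model()
are made up for the illustration, only the effective set is
modelled, and the namespace-owner shortcut in the real
cap_capable() is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Capability numbers as defined in <linux/capability.h>. */
#define CAP_SYS_ADMIN    21
#define CAP_SYS_RESOURCE 24

/* Hypothetical stand-ins for struct user_namespace and struct cred. */
struct user_ns_model {
	struct user_ns_model *parent;	/* NULL for the initial namespace */
};

struct cred_model {
	struct user_ns_model *user_ns;	/* namespace the task lives in */
	uint64_t effective;		/* effective capability bitmask */
};

static struct user_ns_model init_ns_model = { NULL };

static bool cap_raised_model(uint64_t set, int cap)
{
	return (set >> cap) & 1;
}

/* Mirrors the set_cred_user_ns() hunk: a task entering a new user
 * namespace starts without the "way out" capability. */
static void enter_user_ns_model(struct cred_model *cred,
				struct user_ns_model *ns)
{
	cred->effective &= ~(UINT64_C(1) << CAP_SYS_RESOURCE);
	cred->user_ns = ns;
}

/* Mirrors the cap_capable() hunk; returns 0 if allowed, -EPERM if not. */
static int cap_capable_model(const struct cred_model *cred,
			     struct user_ns_model *targ_ns, int cap)
{
	struct user_ns_model *ns = targ_ns;

	for (;;) {
		/* If we belong in this ns, do we have the capability? */
		if (ns == cred->user_ns)
			return cap_raised_model(cred->effective, cap) ? 0 : -EPERM;

		/* Task in a user_ns asking for rights in the initial ns:
		 * honoured only if CAP_SYS_RESOURCE is also raised. */
		if (ns == &init_ns_model) {
			if (!cap_raised_model(cred->effective, CAP_SYS_RESOURCE))
				return -EPERM;
			return cap_raised_model(cred->effective, cap) ? 0 : -EPERM;
		}

		ns = ns->parent;
	}
}
```

In this model, a task that holds CAP_SYS_ADMIN inside a child
namespace gets -EPERM from cap_capable_model(&cred, &init_ns_model,
CAP_SYS_ADMIN) until CAP_SYS_RESOURCE is raised in its effective
set as well, after which the same call returns 0.
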
 kernel/user_namespace.c |    8 ++++++++
 security/commoncap.c    |   18 ++++++++++++++++--
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index d8c30db..f7281fd 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -43,6 +43,14 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
 	key_put(cred->request_key_auth);
 	cred->request_key_auth = NULL;
 #endif
+
+	/* Since CAP_SYS_RESOURCE is the way out of user_ns, we start off having
+	 * it disabled.
+	 */
+	cap_lower(cred->cap_effective, CAP_SYS_RESOURCE);
+	cap_lower(cred->cap_permitted, CAP_SYS_RESOURCE);
+	cap_lower(cred->cap_inheritable, CAP_SYS_RESOURCE);
+
 	/* tgcred will be cleared in our caller bc CLONE_THREAD won't be set */
 	cred->user_ns = user_ns;
 }
diff --git a/security/commoncap.c b/security/commoncap.c
index c44b6fe..cdacb2d 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -83,9 +83,18 @@ int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
 	 * user namespace's parents.
 	 */
 	for (;;) {
-		/* Do we have the necessary capabilities? */
+		/* If we belong in this ns, do we have the capability? */
 		if (ns == cred->user_ns)
 			return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
+		else {
+			/* User_ns asking for rights in init_ns? */
+			if (ns == &init_user_ns) {
+				if (cap_raised(cred->cap_effective, CAP_SYS_RESOURCE))
+					return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
+				else
+					return -EPERM;
+			}
+		}
 
 		/* Have we tried all of the parent namespaces? */
 		if (ns == &init_user_ns)
@@ -481,7 +490,7 @@ int cap_bprm_set_creds(struct linux_binprm *bprm)
 	const struct cred *old = current_cred();
 	struct cred *new = bprm->cred;
 	bool effective, has_cap = false;
-	int ret;
+	int ret, has_res;
 	kuid_t root_uid;
 
 	effective = false;
@@ -501,6 +510,8 @@ int cap_bprm_set_creds(struct linux_binprm *bprm)
 			warn_setuid_and_fcaps_mixed(bprm->filename);
 			goto skip;
 		}
+		has_res = cap_raised(new->cap_permitted, CAP_SYS_RESOURCE);
+
 		/*
 		 * To support inheritance of root-permissions and suid-root
 		 * executables under compatibility mode, we override the
@@ -512,6 +523,9 @@ int cap_bprm_set_creds(struct linux_binprm *bprm)
 			/* pP' = (cap_bset & ~0) | (pI & ~0) */
 			new->cap_permitted = cap_combine(old->cap_bset,
 							 old->cap_inheritable);
+
+			if (!has_res && (old->user_ns != &init_user_ns))
+				cap_lower(new->cap_permitted, CAP_SYS_RESOURCE);
 		}
 		if (uid_eq(new->euid, root_uid))
 			effective = true;
-- 
1.7.9.5
