[PATCH 2/6] User namespace: don't allow sysctl in non-init user ns (v2)

Serge Hallyn serge at hallyn.com
Fri Nov 4 22:24:38 UTC 2011


From: Serge Hallyn <serge.hallyn at canonical.com>

sysctl.c has its own custom uid check, which is not user namespace
aware.  As discovered by Richard, that gives root in a container
privileged access to set any sysctl.

To fix that, skip the uid and group comparisons if current is not in
the initial user namespace, so that such a task only gets the world
access rights.  We may at some point want to relax that check so that
some sysctls are allowed - for instance dmesg_restrict when syslog is
containerized.

Changelog:
Sep 22: As Miquel van Smoorenburg pointed out, rather than always
	refusing access if not in initial user_ns, we should allow
	world access rights to sysctl files.  We just want to prevent
	a task in a non-init user namespace from getting the root user
	or group access rights.

Signed-off-by: Serge Hallyn <serge.hallyn at canonical.com>
Cc: "Eric W. Biederman" <ebiederm at xmission.com>
Cc: Vasiliy Kulikov <segoon at openwall.com>
Cc: richard at nod.at
Cc: Miquel van Smoorenburg <mikevs at xs4all.net>
---
 kernel/sysctl.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index ae27196..473df41 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1708,10 +1708,12 @@ void register_sysctl_root(struct ctl_table_root *root)
 
 static int test_perm(int mode, int op)
 {
-	if (!current_euid())
-		mode >>= 6;
-	else if (in_egroup_p(0))
-		mode >>= 3;
+	if (current_user_ns() == &init_user_ns) {
+		if (!current_euid())
+			mode >>= 6;
+		else if (in_egroup_p(0))
+			mode >>= 3;
+	}
 	if ((op & ~mode & (MAY_READ|MAY_WRITE|MAY_EXEC)) == 0)
 		return 0;
 	return -EACCES;
-- 
1.7.0.4
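
For readers who want to see the effect of the check outside the kernel,
here is a minimal userspace sketch of the patched test_perm() logic.  It
is an illustration only, not kernel code: struct caller and its fields
are hypothetical stand-ins for current_user_ns(), current_euid() and
in_egroup_p(0), and the MAY_* values are assumed to match
include/linux/fs.h.

```c
#include <assert.h>
#include <errno.h>

/* Assumed to match the kernel's include/linux/fs.h values. */
#define MAY_EXEC  1
#define MAY_WRITE 2
#define MAY_READ  4

/* Hypothetical stand-in for the kernel's view of the calling task. */
struct caller {
	int in_init_user_ns;	/* current_user_ns() == &init_user_ns */
	unsigned int euid;	/* current_euid() */
	int in_group0;		/* in_egroup_p(0) */
};

/*
 * Mirrors the patched check: the owner and group permission bits are
 * only selected for tasks in the initial user namespace.  Everyone
 * else is tested against the low three (world) bits of mode.
 */
static int test_perm_sketch(const struct caller *c, int mode, int op)
{
	if (c->in_init_user_ns) {
		if (c->euid == 0)
			mode >>= 6;
		else if (c->in_group0)
			mode >>= 3;
	}
	if ((op & ~mode & (MAY_READ | MAY_WRITE | MAY_EXEC)) == 0)
		return 0;
	return -EACCES;
}

/* Small self-check showing the intended behaviour for mode 0644. */
static void test_perm_sketch_selfcheck(void)
{
	struct caller host_root = { 1, 0, 1 };
	struct caller container_root = { 0, 0, 1 };

	/* Root in the initial namespace may write a 0644 sysctl. */
	assert(test_perm_sketch(&host_root, 0644, MAY_WRITE) == 0);
	/* Root in a container is treated as "other": read-only. */
	assert(test_perm_sketch(&container_root, 0644, MAY_READ) == 0);
	assert(test_perm_sketch(&container_root, 0644, MAY_WRITE) == -EACCES);
}
```

With a 0644 entry, root in a container can still read the file through
the world bits but can no longer write it; a 0600 entry becomes
unreadable from the container as well.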


