[PATCH 1/8] kernel/exit.c: make sure current's nsproxy != NULL while checking caps

Lukasz Pawelczyk l.pawelczyk at samsung.com
Thu May 21 11:53:35 UTC 2015


There is a rare case where current's nsproxy might be NULL but we are
still required to check credentials and capabilities. It sometimes
happens during an exit_group() syscall while destroying a user's session
(logging out).

My understanding is that while we have to lock a task to get its
nsproxy and check whether it's NULL, for 'current' we don't have to,
and it's expected not to be NULL. There is code in the kernel that
currently does current->nsproxy->user_ns without any checks, and
include/linux/nsproxy.h confirms that:

2. when accessing (i.e. reading) current task's namespaces - no
   precautions should be taken - just dereference the pointers
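
The two access rules can be sketched in userspace as below. This is only
an illustrative model, not kernel code: the struct names, the ns_id
field, the locked flag standing in for task_lock()/task_unlock(), and
both helper functions are hypothetical.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct nsproxy_sketch {
	int ns_id;			/* placeholder for a namespace pointer */
};

struct task_sketch {
	int locked;			/* stands in for task_lock()/task_unlock() */
	struct nsproxy_sketch *nsproxy;	/* NULL once namespaces are dropped */
};

/*
 * Rule for another task: take the lock, then check nsproxy for NULL,
 * because that task may already be exiting.
 * Returns 0 and stores the id on success, -1 if nsproxy is gone.
 */
static int other_task_ns_id(struct task_sketch *t, int *id)
{
	int ret = -1;

	t->locked = 1;
	if (t->nsproxy) {
		*id = t->nsproxy->ns_id;
		ret = 0;
	}
	t->locked = 0;
	return ret;
}

/*
 * Rule for current: plain dereference, no lock and no NULL check.
 * This is only safe while current's nsproxy is guaranteed non-NULL.
 */
static int current_ns_id(const struct task_sketch *cur)
{
	return cur->nsproxy->ns_id;
}
```

The second helper is the pattern that breaks if it is reached after the
task has already dropped its namespaces.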

There seems to be no crash because of this currently, but once nsproxy
is accessed from LSM hooks there is one. This is the backtrace:

0  smk_tskacc (task=0xffff88003b0b92e0, obj_known=0x2 <irq_stack_union+2>, mode=2, a=0xffff88003be53dd8) at security/smack/smack_access.c:261
1  0xffffffff8130e2aa in smk_curacc (obj_known=<optimized out>, mode=<optimized out>, a=<optimized out>) at security/smack/smack_access.c:318
2  0xffffffff8130a50d in smack_task_kill (p=0xffff88003b0b92e0, info=<optimized out>, sig=<optimized out>, secid=<optimized out>) at security/smack/smack_lsm.c:2071
3  0xffffffff812ea4f6 in security_task_kill (p=<optimized out>, info=<optimized out>, sig=<optimized out>, secid=<optimized out>) at security/security.c:952
4  0xffffffff8109ac80 in check_kill_permission (sig=15, info=0x0 <irq_stack_union>, t=0xffff88003b0b8000) at kernel/signal.c:796
5  0xffffffff8109d3ab in group_send_sig_info (sig=15, info=0x0 <irq_stack_union>, p=0xffff88003b0b8000) at kernel/signal.c:1296
6  0xffffffff8108e527 in forget_original_parent (father=<optimized out>) at kernel/exit.c:575
7  exit_notify (group_dead=<optimized out>, tsk=<optimized out>) at kernel/exit.c:606
8  do_exit (code=<optimized out>) at kernel/exit.c:775
9  0xffffffff8108ec0f in do_group_exit (exit_code=0) at kernel/exit.c:891
10 0xffffffff8108ec84 in SYSC_exit_group (error_code=<optimized out>) at kernel/exit.c:902
11 SyS_exit_group (error_code=<optimized out>) at kernel/exit.c:900

This backtrace shows that the task_kill() LSM hook is called during an
exit_group() syscall, and that this happens after
exit_task_namespaces(). LSM hooks that work with namespaces might need
nsproxy to be able to check for capabilities. At this point that is
impossible: current's nsproxy is already NULL/destroyed.

This is the case because exit_task_namespaces() is called before
exit_notify(), where all of the above happens. This patch changes their
order.
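
The effect of the reordering can be illustrated with a small userspace
model. It is a sketch under stated assumptions, not the kernel
implementation: all model_* names and both structs are hypothetical, and
the modelled hook returns -1 where the real code would dereference a
NULL nsproxy.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these are kernel definitions. */
struct nsproxy_model {
	int ns_id;
};

struct task_model {
	struct nsproxy_model *nsproxy;
};

/* Models exit_task_namespaces(): the task drops its namespaces. */
static void model_exit_task_namespaces(struct task_model *t)
{
	t->nsproxy = NULL;
}

/*
 * Models an LSM hook reached from exit_notify() that needs nsproxy.
 * Returns 0 if nsproxy is still available, -1 if it is already gone.
 */
static int model_exit_notify(const struct task_model *t)
{
	return t->nsproxy ? 0 : -1;
}

/* Old order: namespaces are dropped before the hook runs. */
static int model_do_exit_old(struct task_model *t)
{
	model_exit_task_namespaces(t);
	return model_exit_notify(t);
}

/* New order: the hook runs first, then namespaces are dropped. */
static int model_do_exit_new(struct task_model *t)
{
	int ret = model_exit_notify(t);

	model_exit_task_namespaces(t);
	return ret;
}
```

In this model the old order always reports the failure and the new order
does not, while nsproxy still ends up NULL in both cases.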

Signed-off-by: Lukasz Pawelczyk <l.pawelczyk at samsung.com>
---
 kernel/exit.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 22fcc05..da1bb18 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -742,7 +742,6 @@ void do_exit(long code)
 	exit_fs(tsk);
 	if (group_dead)
 		disassociate_ctty(1);
-	exit_task_namespaces(tsk);
 	exit_task_work(tsk);
 	exit_thread();
 
@@ -763,6 +762,13 @@ void do_exit(long code)
 
 	TASKS_RCU(tasks_rcu_i = __srcu_read_lock(&tasks_rcu_exit_srcu));
 	exit_notify(tsk, group_dead);
+
+	/*
+	 * This should be after all things that potentially require
+	 * process's namespaces (e.g. capability checks).
+	 */
+	exit_task_namespaces(tsk);
+
 	proc_exit_connector(tsk);
 #ifdef CONFIG_NUMA
 	task_lock(tsk);
-- 
2.1.0


