[PATCH 1/8] kernel/exit.c: make sure current's nsproxy != NULL while checking caps

Lukasz Pawelczyk l.pawelczyk at samsung.com
Mon May 25 11:33:18 UTC 2015


On Sat, 2015-05-23 at 12:49 -0500, Eric W. Biederman wrote:
> Lukasz Pawelczyk <l.pawelczyk at samsung.com> writes:
> 
> > There is a rare case where current's nsproxy might be NULL but we are
> > required to check for credentials and capabilities. It sometimes happens
> > during an exit_group() syscall while destroying user's session (logging
> > out).
> >
> > My understanding is that while we have to lock the task to get task's
> > nsproxy and check whether it's NULL, for the 'current' we don't have to
> > and it's expected not to be NULL. There is code in the kernel
> > currently that does current->nsproxy->user_ns without any checks.
> > And include/linux/nsproxy.h confirms that:
> >
> > 2. when accessing (i.e. reading) current task's namespaces - no
> >    precautions should be taken - just dereference the pointers
> >
> > There seems to be no crash currently because of this, but when accessing
> > nsproxy from LSM hooks there is one. This is the backtrace:
> >
> > 0  smk_tskacc (task=0xffff88003b0b92e0, obj_known=0x2 <irq_stack_union+2>, mode=2, a=0xffff88003be53dd8) at security/smack/smack_access.c:261
> > 1  0xffffffff8130e2aa in smk_curacc (obj_known=<optimized out>, mode=<optimized out>, a=<optimized out>) at security/smack/smack_access.c:318
> > 2  0xffffffff8130a50d in smack_task_kill (p=0xffff88003b0b92e0, info=<optimized out>, sig=<optimized out>, secid=<optimized out>) at security/smack/smack_lsm.c:2071
> > 3  0xffffffff812ea4f6 in security_task_kill (p=<optimized out>, info=<optimized out>, sig=<optimized out>, secid=<optimized out>) at security/security.c:952
> > 4  0xffffffff8109ac80 in check_kill_permission (sig=15, info=0x0 <irq_stack_union>, t=0xffff88003b0b8000) at kernel/signal.c:796
> > 5  0xffffffff8109d3ab in group_send_sig_info (sig=15, info=0x0 <irq_stack_union>, p=0xffff88003b0b8000) at kernel/signal.c:1296
> > 6  0xffffffff8108e527 in forget_original_parent (father=<optimized out>) at kernel/exit.c:575
> > 7  exit_notify (group_dead=<optimized out>, tsk=<optimized out>) at kernel/exit.c:606
> > 8  do_exit (code=<optimized out>) at kernel/exit.c:775
> > 9  0xffffffff8108ec0f in do_group_exit (exit_code=0) at kernel/exit.c:891
> > 10 0xffffffff8108ec84 in SYSC_exit_group (error_code=<optimized out>) at kernel/exit.c:902
> > 11 SyS_exit_group (error_code=<optimized out>) at kernel/exit.c:900
> >
> > This backtrace clearly shows that there is an LSM hook task_kill() that
> > happens during an exit_group() syscall and that this happens after
> > exit_task_namespaces(). LSM hooks with namespaces might need nsproxy to
> > be able to check for capabilities. At this point this is impossible. The
> > current's nsproxy is already NULL/destroyed.
> >
> > This is the case because exit_task_namespaces() is called before the
> > exit_notify() where all of the above happens. This patch changes their
> > order.
> 
> Nacked-by: "Eric W. Biederman" <ebiederm at xmission.com>
> 
> current->nsproxy->user_ns does not exist,
> and changing where exit_task_namespaces is called is fragile and I am
> really not interested in messing with it right now, to solve a problem
> that does not exist.

I must have missed the moment when current->nsproxy->user_ns was
removed. I don't even use it in my patches anymore (I replaced it
with cred->user_ns).

Back when I started to write my patches and wanted to use
current->nsproxy->user_ns in LSM hooks, the problem was real.

Fortunately current->cred->user_ns does not exhibit the same issue. I'll
drop this patch.
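
To illustrate the difference, here is a minimal userspace sketch. It is
not kernel code: the structs and every toy_* name are hypothetical
stand-ins, assumed only for illustration. It models a task whose nsproxy
pointer is cleared before a hook runs, while the user namespace stays
reachable through cred.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; NOT the kernel's real definitions. */
struct user_namespace { int level; };
struct nsproxy { int count; };	/* note: no user_ns member */
struct cred { struct user_namespace *user_ns; };

struct task {
	struct nsproxy *nsproxy;	/* cleared when namespaces are dropped */
	const struct cred *cred;	/* assumed still valid when hooks run */
};

/* Models the effect of exit_task_namespaces(): nsproxy becomes NULL. */
static void toy_exit_task_namespaces(struct task *tsk)
{
	tsk->nsproxy = NULL;
}

/* True only while the task's nsproxy may still be dereferenced. */
static bool toy_nsproxy_usable(const struct task *tsk)
{
	return tsk->nsproxy != NULL;
}

/* Hypothetical hook helper: reach the user namespace through cred. */
static struct user_namespace *toy_hook_user_ns(const struct task *tsk)
{
	assert(tsk->cred != NULL);
	return tsk->cred->user_ns;
}
```

After toy_exit_task_namespaces(), toy_nsproxy_usable() returns false, so
any hook dereferencing nsproxy would fault, while toy_hook_user_ns()
still returns the namespace stored in cred.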

Sorry for the confusion.


> 
> >
> > Signed-off-by: Lukasz Pawelczyk <l.pawelczyk at samsung.com>
> > ---
> >  kernel/exit.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/exit.c b/kernel/exit.c
> > index 22fcc05..da1bb18 100644
> > --- a/kernel/exit.c
> > +++ b/kernel/exit.c
> > @@ -742,7 +742,6 @@ void do_exit(long code)
> >  	exit_fs(tsk);
> >  	if (group_dead)
> >  		disassociate_ctty(1);
> > -	exit_task_namespaces(tsk);
> >  	exit_task_work(tsk);
> >  	exit_thread();
> >  
> > @@ -763,6 +762,13 @@ void do_exit(long code)
> >  
> >  	TASKS_RCU(tasks_rcu_i = __srcu_read_lock(&tasks_rcu_exit_srcu));
> >  	exit_notify(tsk, group_dead);
> > +
> > +	/*
> > +	 * This should be after all things that potentially require
> > +	 * process's namespaces (e.g. capability checks).
> > +	 */
> > +	exit_task_namespaces(tsk);
> > +
> >  	proc_exit_connector(tsk);
> >  #ifdef CONFIG_NUMA
> >  	task_lock(tsk);

-- 
Lukasz Pawelczyk
Samsung R&D Institute Poland
Samsung Electronics




