[PATCH RESEND 2/5] seccomp: Add wait_killable semantic to seccomp user notifier

Sargun Dhillon sargun at sargun.me
Mon Apr 26 22:15:28 UTC 2021


On Mon, Apr 26, 2021 at 01:02:29PM -0600, Tycho Andersen wrote:
> On Mon, Apr 26, 2021 at 11:06:07AM -0700, Sargun Dhillon wrote:
> > @@ -1103,11 +1111,31 @@ static int seccomp_do_user_notification(int this_syscall,
> >  	 * This is where we wait for a reply from userspace.
> >  	 */
> >  	do {
> > +		interruptible = notification_interruptible(&n);
> > +
> >  		mutex_unlock(&match->notify_lock);
> > -		err = wait_for_completion_interruptible(&n.ready);
> > +		if (interruptible)
> > +			err = wait_for_completion_interruptible(&n.ready);
> > +		else
> > +			err = wait_for_completion_killable(&n.ready);
> >  		mutex_lock(&match->notify_lock);
> > -		if (err != 0)
> > +
> > +		if (err != 0) {
> > +			/*
> > +			 * There is a race condition here where if the
> > +			 * notification was received with the
> > +			 * SECCOMP_USER_NOTIF_FLAG_WAIT_KILLABLE flag, but a
> > +			 * non-fatal signal was received before we could
> > +			 * transition we could erroneously end our wait early.
> > +			 *
> > +			 * The next wait for completion will ensure the signal
> > +			 * was not fatal.
> > +			 */
> > +			if (interruptible && !notification_interruptible(&n))
> > +				continue;
> 
> I'm trying to understand how one would hit this race,
> 

I'm thinking:
P: Process that "generates" notification
S: Supervisor
U: User

P: Generated notification
S: ioctl(RECV...) // With wait_killable flag.
...complete is called in the supervisor, but the P may not be woken up...
U: kill -SIGTERM $P
...signal gets delivered to p and causes wakeup and
wait_for_completion_interruptible returns 1...

Then you need to check the race
> > @@ -1457,6 +1487,12 @@ static long seccomp_notify_recv(struct seccomp_filter *filter,
> >  	unotif.pid = task_pid_vnr(knotif->task);
> >  	unotif.data = *(knotif->data);
> >  
> > +	if (unotif.flags & SECCOMP_USER_NOTIF_FLAG_WAIT_KILLABLE) {
> > +		knotif->wait_killable = true;
> > +		complete(&knotif->ready);
> > +	}
> > +
> > +
> >  	knotif->state = SECCOMP_NOTIFY_SENT;
> >  	wake_up_poll(&filter->wqh, EPOLLOUT | EPOLLWRNORM);
> >  	ret = 0;
> 
> Seems like the idea is that if someone does a ioctl(RECV, ...) twice
> they'll hit it? But doesn't the test for NOTIFY_INIT and return
> -ENOENT above this hunk prevent that?
> 
> Thanks,
> 
> Tycho


More information about the Containers mailing list