[PATCH 1/1] seccomp: Always "goto wait" if the list is empty

Christian Brauner christian.brauner at ubuntu.com
Tue Apr 13 17:53:55 UTC 2021


On Tue, Apr 13, 2021 at 06:01:51PM +0200, Rodrigo Campos wrote:
> It is possible for the thread with the seccomp filter attached (target)
> to be waken up by an addfd message, but the list be empty. This happens
> when the addfd ioctl on the other side (seccomp agent) is interrupted by
> a signal such as SIGURG. In that case, the target erroneously and
> prematurely returns from the syscall to userspace even though the
> seccomp agent didn't ask for it.
> 
> This happens in the following scenario:
> 
> seccomp_notify_addfd()                                           | seccomp_do_user_notification()
>                                                                  |
>                                                                  |  err = wait_for_completion_interruptible(&n.ready);
>  complete(&knotif->ready);                                       |
>  ret = wait_for_completion_interruptible(&kaddfd.completion);    |
>  // interrupted                                                  |
>                                                                  |
>  mutex_lock(&filter->notify_lock);                               |
>  list_del(&kaddfd.list);                                         |
>  mutex_unlock(&filter->notify_lock);                             |
>                                                                  |  mutex_lock(&match->notify_lock);
>                                                                  |  // This is false, addfd is false
>                                                                  |  if (addfd && n.state != SECCOMP_NOTIFY_REPLIED)
>                                                                  |
>                                                                  |  ret = n.val;
>                                                                  |  err = n.error;
>                                                                  |  flags = n.flags;
> 
> So, the process blocked in seccomp_do_user_notification() will see a
> response. As n is 0 initialized and wasn't set, it will see a 0 as
> return value from the syscall.
> 
> The seccomp agent, when retrying the interrupted syscall, will see an
> ENOENT error as the notification no longer exists (it was already
> answered by this bug).
> 
> This patch fixes the issue by splitting the if in two parts: if we were
> woken up and the state is not replied, we will always do a "goto wait".
> And if that happens and there is an addfd element on the list, we will
> add the fd before "goto wait".
> 
> This issue is present since 5.9, when addfd was added.
> 
> Fixes: 7cf97b1254550
> Cc: stable at vger.kernel.org # 5.9+
> Signed-off-by: Rodrigo Campos <rodrigo at kinvolk.io>
> ---

So the agent will see the return value from
wait_for_completion_interruptible() and know that the addfd wasn't
successful and the target will notice that no addfd request has actually
been added and essentially try again. Seems like a decent fix and can be
backported cleanly. I assume seccomp testsuite passes.

Acked-by: Christian Brauner <christian.brauner at ubuntu.com>


More information about the Containers mailing list