For review: seccomp_user_notif(2) manual page

Jann Horn jannh at google.com
Thu Oct 29 04:26:05 UTC 2020


On Thu, Oct 29, 2020 at 3:13 AM Tycho Andersen <tycho at tycho.pizza> wrote:
> > > Consider the following scenario (with supervisor "S" and target "T"; S
> > > wants to wait for events on two file descriptors seccomp_fd and
> > > other_fd):
> > >
> > > S: starts poll() to wait for events on seccomp_fd and other_fd
> > > T: performs a syscall that's filtered with RET_USER_NOTIF
> > > S: poll() returns and signals readiness of seccomp_fd
> > > T: receives signal SIGUSR1
> > > T: syscall aborts, enters signal handler
> > > T: signal handler blocks on unfiltered syscall (e.g. write())
> > > S: starts SECCOMP_IOCTL_NOTIF_RECV
> > > S: blocks because no syscalls are pending
> > >
> > > Depending on what other_fd is, this could in a worst case even lead to
> > > a deadlock (if e.g. the signal handler wants to write to stdout, but
> > > the stdout fd is hooked up to other_fd in the supervisor, but the
> > > supervisor can't consume the data written because it's stuck in
> > > seccomp handling).
> > >
> > > So we have to ensure that when existing code (like that crun code you
> > > linked to) triggers this case, SECCOMP_IOCTL_NOTIF_RECV returns
> > > immediately instead of blocking.
> >
> > Or I guess we could also just set O_NONBLOCK on the fd by default?
> > Since the one existing user is eventloop-based...
>
> I feel like it's ok to return an error from the RECV ioctl() if
> there's never going to be any more events on the fd; was there
> something fundamentally wrong with your patch here:
> https://lore.kernel.org/bpf/CAG48ez2xn+_KznEztJ-eVTsTzkbf9CVgPqaAk7TpRNAqbdaRoA@mail.gmail.com/
> ?

No, I have a new version of that about 80% done and hope to send it
out soonish. (There's some stuff around tests that I still need to
cobble together).


More information about the Containers mailing list