For review: seccomp_user_notif(2) manual page [v2]

Fri Oct 30 19:14:37 UTC 2020

On Thu, Oct 29, 2020 at 3:19 PM Michael Kerrisk (man-pages)
<mtk.manpages at gmail.com> wrote:
> On 10/29/20 2:42 AM, Jann Horn wrote:
> > On Mon, Oct 26, 2020 at 10:55 AM Michael Kerrisk (man-pages)
> > <mtk.manpages at gmail.com> wrote:
> >>        static bool
> >>        getTargetPathname(struct seccomp_notif *req, int notifyFd,
> >>                          char *path, size_t len)
> >>        {
> >>            char procMemPath[PATH_MAX];
> >>
> >>            snprintf(procMemPath, sizeof(procMemPath), "/proc/%d/mem", req->pid);
> >>
> >>            int procMemFd = open(procMemPath, O_RDONLY);
> >>            if (procMemFd == -1)
> >>                errExit("\tS: open");
> >>
> >>            /* Check that the process whose info we are accessing is still alive.
> >>               If the SECCOMP_IOCTL_NOTIF_ID_VALID operation (performed
> >>               in checkNotificationIdIsValid()) succeeds, we know that the
> >>               /proc/PID/mem file descriptor that we opened corresponds to the
> >>               process for which we received a notification. If that process
> >>               subsequently terminates, then read() on that file descriptor
> >>               will return 0 (EOF). */
> >>
> >>            checkNotificationIdIsValid(notifyFd, req->id);
> >>
> >>            /* Read bytes at the location containing the pathname argument
> >>               (i.e., the first argument) of the mkdir(2) call */
> >>
> >>            ssize_t nread = pread(procMemFd, path, len, req->data.args[0]);
> >>            if (nread == -1)
> >>                errExit("pread");
> >
> > As discussed at
> > <https://lore.kernel.org/r/CAG48ez0m4Y24ZBZCh+Tf4ORMm9_q4n7VOzpGjwGF7_Fe8EQH=Q@mail.gmail.com>,
> > we need to re-check checkNotificationIdIsValid() after reading remote
> > memory but before using the read value in any way. Otherwise, the
> > syscall could in the meantime get interrupted by a signal handler, the
> > signal handler could return, and then the function that performed the
> > syscall could free() allocations or return (thereby freeing buffers on
> > the stack).
> >
> > In essence, this pread() is (unavoidably) a potential use-after-free
> > read; and to make that not have any security impact, we need to check
> > whether UAF read occurred before using the read value. This should
> > probably be called out elsewhere in the manpage, too...
>
> Thanks very much for pointing me at this!
>
> So, I want to conform that the fix to the code is as simple as
> adding a check following the pread() call, something like:
>
> [[
>      ssize_t nread = pread(procMemFd, path, len, req->data.args[argNum]);
>      if (nread == -1)
>         errExit("Supervisor: pread");
>
>      if (nread == 0) {
>         fprintf(stderr, "\tS: pread() of /proc/PID/mem "
>                 "returned 0 (EOF)\n");
>         exit(EXIT_FAILURE);
>      }
>
>      if (close(procMemFd) == -1)
>         errExit("Supervisor: close-/proc/PID/mem");
>
> +    /* Once again check that the notification ID is still valid. The
> +       case we are particularly concerned about here is that just
> +       before we fetched the pathname, the target's blocked system
> +       call was interrupted by a signal handler, and after the handler
> +       returned, the target carried on execution (past the interrupted
> +       system call). In that case, we have no guarantees about what we
> +       are reading, since the target's memory may have been arbitrarily
> +       changed by subsequent operations. */
> +
> +    if (!notificationIdIsValid(notifyFd, req->id, "post-open"))
> +        return false;
> +
>      /* We have no guarantees about what was in the memory of the target
>         process. We therefore treat the buffer returned by pread() as
>         untrusted input. The buffer should be terminated by a null byte;
>         if not, then we will trigger an error for the target process. */
>
>      if (strnlen(path, nread) < nread)
>          return true;
> ]]

Yeah, that should do the job. With the caveat that a cancelled syscall
could've also led to the memory being munmap()ed, so the nread==0 case
could also happen legitimately - so you might want to move this check
up above the nread==0 (mm went away) and nread==-1 (mm still exists,
but read from address failed, errno EIO) checks if the error message
shouldn't appear spuriously.