[PATCH] nsproxy: attach to namespaces via pidfds

Christian Brauner christian.brauner at ubuntu.com
Mon Apr 27 16:11:47 UTC 2020


On Mon, Apr 27, 2020 at 10:21:55AM -0500, Eric W. Biederman wrote:
> 
> I am still catching up on what exists for pidfd.  Do you have a way
> to safely go from a pidfd to the corresponding proc directory?

Yep, that's possible. The pidfd's fdinfo file has Pid: and NSpid:
fields in the same format as /proc/<pid>/status. Here is, for example,
what systemd currently does:

int pidfd_get_pid(int fd, pid_t *ret) {
        char path[STRLEN("/proc/self/fdinfo/") + DECIMAL_STR_MAX(int)];
        _cleanup_free_ char *fdinfo = NULL;
        char *p;
        int r;

        if (fd < 0)
                return -EBADF;

        xsprintf(path, "/proc/self/fdinfo/%i", fd);

        r = read_full_file(path, &fdinfo, NULL);
        if (r == -ENOENT) /* if fdinfo doesn't exist we assume the process does not exist */
                return -ESRCH;
        if (r < 0)
                return r;

        p = startswith(fdinfo, "Pid:");
        if (!p) {
                p = strstr(fdinfo, "\nPid:");
                if (!p)
                        return -ENOTTY; /* not a pidfd? */

                p += 5;
        }

        p += strspn(p, WHITESPACE);
        p[strcspn(p, WHITESPACE)] = 0;

        return parse_pid(p, ret);
}
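
The systemd snippet above relies on systemd-internal helpers
(read_full_file, startswith, parse_pid, ...). A minimal self-contained
sketch of the same lookup, using only libc, might look like the
following. The names fdinfo_parse_pid and pidfd_to_pid are hypothetical
and not part of any library, and treating a non-positive Pid: value as
-ESRCH is an assumption of this sketch rather than something the
systemd code does.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: extract the "Pid:" value from fdinfo text.
 * Returns 0 and stores the PID on success, -ENOTTY if there is no
 * "Pid:" line (probably not a pidfd), -EINVAL on malformed input and
 * -ESRCH for a non-positive value (assumption: no usable PID). */
static int fdinfo_parse_pid(const char *fdinfo, pid_t *ret)
{
        const char *p;
        char *end;
        long v;

        assert(fdinfo && ret);

        if (strncmp(fdinfo, "Pid:", 4) == 0)
                p = fdinfo + 4;
        else if ((p = strstr(fdinfo, "\nPid:")) != NULL)
                p += 5;
        else
                return -ENOTTY;

        errno = 0;
        v = strtol(p, &end, 10); /* strtol skips the leading whitespace */
        if (errno != 0 || end == p || v > INT_MAX)
                return -EINVAL;
        if (v <= 0)
                return -ESRCH;

        *ret = (pid_t)v;
        return 0;
}

/* Hypothetical wrapper: read /proc/self/fdinfo/<fd> and parse it. */
static int pidfd_to_pid(int fd, pid_t *ret)
{
        char path[64], buf[4096];
        size_t n;
        FILE *f;

        if (fd < 0)
                return -EBADF;

        snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);
        f = fopen(path, "r");
        if (!f) /* as in the systemd code, a missing file maps to -ESRCH */
                return errno == ENOENT ? -ESRCH : -errno;

        n = fread(buf, 1, sizeof(buf) - 1, f);
        fclose(f);
        buf[n] = '\0';

        return fdinfo_parse_pid(buf, ret);
}
```

With the resulting PID a caller can build the /proc/<pid> path, keeping
the pidfd open so it can still tell whether the process it refers to
has gone away in the meantime.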

> 
> That would make this setns work just an optimization.  A nice one but
> just an optimization.

Hm, I tried to describe why it's more than just a worthwhile
optimization. It gets the number of syscalls down from 14 to a single
one, which is kinda excellent for something like attach/exec into a
container, a fairly common operation. But it also gives us a couple of
other nice properties, such as atomic attach and appearing in all
namespaces at the same time, similar to clone with all namespace flags
set.
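
To make the single-syscall attach concrete, here is a rough sketch,
assuming the interface this patch proposes: setns() takes a pidfd plus
a bitmask of CLONE_NEW* flags. The wrapper names and the particular
flag set are hypothetical choices for illustration, and the fallback
syscall number is an assumption for older headers.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434 /* assumption: generic syscall number */
#endif

/* Namespaces to join in one call; extend or trim as needed. User and
 * cgroup namespaces are left out here only to keep the sketch small. */
#define ATTACH_NS_FLAGS (CLONE_NEWNS | CLONE_NEWPID | CLONE_NEWNET | \
                         CLONE_NEWIPC | CLONE_NEWUTS)

/* Hypothetical wrapper: obtain a pidfd for a running process.
 * Returns the fd or a negative errno. */
static int open_pidfd(pid_t pid)
{
        int fd = (int)syscall(SYS_pidfd_open, pid, 0);

        return fd < 0 ? -errno : fd;
}

/* Hypothetical wrapper: join every namespace in ATTACH_NS_FLAGS of the
 * process behind pidfd with one setns() call. The switch is meant to be
 * all-or-nothing. Note that the PID namespace only takes effect for
 * children created afterwards. Returns 0 or a negative errno. */
static int attach_to_namespaces(int pidfd)
{
        if (pidfd < 0)
                return -EBADF;

        if (setns(pidfd, ATTACH_NS_FLAGS) < 0)
                return -errno;

        return 0;
}
```

A caller would typically pass the result of open_pidfd() to
attach_to_namespaces() and then fork and exec the command that should
run inside the container. This needs the usual privileges over the
target's namespaces.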

Thanks!
Christian

