setns vs unshare bug
Pavel Emelyanov
xemul at parallels.com
Fri Aug 10 14:55:36 UTC 2012
Hi, Eric!
There's an issue with the setns syscall versus the unshare syscall which
I think is worth looking at. When you open some task's namespace file,
e.g. /proc/<pid>/ns/net, the net namespace is cached on the proc inode.
If the task with pid <pid> later unshares the namespace in question
(here, the net ns), subsequent opens of this task's proc ns file still
return the old namespace, and the setns call does not work as expected.
Here's a simple program which demonstrates this:
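
    #define _GNU_SOURCE	/* for unshare(), setns() and CLONE_NEWNET */
    #include <fcntl.h>
    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        int pid, fd;
        char path[64];

        pid = fork();
        if (!pid) {
            /* Child: opening the ns file caches the current net ns
             * on the proc inode */
            fd = open("/proc/self/ns/net", O_RDONLY);
            close(fd);
            unshare(CLONE_NEWNET);
            printf("New net:\n");
            system("ip l");
            sleep(1);
        } else {
            /* Parent: let the child unshare first, then try to
             * join its net ns */
            sleep(1);
            printf("Old net:\n");
            system("ip l");
            sprintf(path, "/proc/%d/ns/net", pid);
            fd = open(path, O_RDONLY);
            setns(fd, CLONE_NEWNET);
            printf("New net 2:\n");
            system("ip l");
        }
        return 0;
    }
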
The "else" branch expects that after setns it is in the new net namespace
(the one containing only a lo device), but it's not so: after the setns
syscall the net namespace isn't changed! If you comment out the open and
close calls in the "if" branch (thus avoiding the ns caching), setns works
as expected.
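
Instead of eyeballing the "ip l" output, the result could probably be
checked programmatically. The following is only a sketch, and it assumes a
kernel (3.8 or later, as far as I know) where two /proc/<pid>/ns/net
entries report the same device ID and inode number exactly when they refer
to the same namespace. The helper name same_ns is made up for illustration
and is not part of the program above.

```c
#include <sys/stat.h>

/* Hypothetical helper, not part of the original test program.
 * Returns 1 if both paths resolve to the same object (same device ID
 * and inode number), 0 if they differ, -1 if either stat() call fails. */
int same_ns(const char *path_a, const char *path_b)
{
	struct stat a, b;

	if (stat(path_a, &a) < 0 || stat(path_b, &b) < 0)
		return -1;
	return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

Under that assumption, calling same_ns("/proc/self/ns/net", path) in the
parent before setns should give 0 if the child really sits in a new net
namespace, and 1 after a setns that took effect.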
I assume you're aware of this problem, so do you have plans to fix this?
Thanks,
Pavel