[PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns
Serge E. Hallyn
serge at hallyn.com
Mon Sep 29 14:00:10 UTC 2014
Quoting Chen, Hanxiao (chenhanxiao at cn.fujitsu.com):
> > -----Original Message-----
> > From: containers-bounces at lists.linux-foundation.org
> > [mailto:containers-bounces at lists.linux-foundation.org] On Behalf Of Chen
> > Hanxiao
> > Sent: Wednesday, September 24, 2014 6:00 PM
> > To: containers at lists.linux-foundation.org; linux-kernel at vger.kernel.org
> > Cc: Richard Weinberger; Serge Hallyn; Oleg Nesterov; Mateusz Guzik; David Howells;
> > Eric W. Biederman
> > Subject: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns
> > If an issue occurs inside a container guest, the host user
> > cannot tell which process is in trouble from the guest pid alone:
> > users of the container guest only know the pid inside the container.
> > This is an obstacle to troubleshooting.
> > This patch adds four fields: NStgid, NSpid, NSpgid and NSsid:
> > a) In init_pid_ns, nothing changed;
> > b) In one pidns, the fields show the pid inside containers:
> > NStgid: 21776 5 1
> > NSpid: 21776 5 1
> > NSpgid: 21776 5 1
> > NSsid: 21729 1 0
> > ** Process id is 21776 in level 0, 5 in level 1, 1 in level 2.
> > c) If pidns is nested, it depends on which pidns you are in.
> > NStgid: 5 1
> > NSpid: 5 1
> > NSpgid: 5 1
> > NSsid: 1 0
> > ** Views from level 1
> This patch is simple, useful and safe.
> But so far there has not been any feedback.
> Any comments or ideas?
Thanks, Chen. The code looks fine. My concern is that you are
exposing information which cannot be checkpointed and restarted.
In particular, if I'm inside a nested container, so I'm in pidns
level 3, then my own NSpid info, when I read it, will show the
pids at parent namespaces. If I'm restarted at the third pidns
level, only the one pid can be restored.
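To make the concern concrete, here is a minimal Python sketch (not part of the patch). It assumes the field layout shown in the patch description above, with pids listed from the outermost visible level to the task's own pidns; parse_ns_fields is a hypothetical helper name. It splits an NSpid line into the ancestor-namespace pids, which a restart inside the nested pidns could not recreate, and the task's own pid, which it could.

```python
def parse_ns_fields(status_text):
    """Return {field: [pid per pidns level, outermost first]} for the NS* lines."""
    wanted = ("NStgid", "NSpid", "NSpgid", "NSsid")
    fields = {}
    for line in status_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            fields[name] = [int(tok) for tok in rest.split()]
    return fields


# Sample values copied from case b) of the patch description.
sample = """NStgid: 21776 5 1
NSpid: 21776 5 1
NSpgid: 21776 5 1
NSsid: 21729 1 0
"""

fields = parse_ns_fields(sample)
# Everything before the last entry belongs to ancestor namespaces.
ancestors, own = fields["NSpid"][:-1], fields["NSpid"][-1]
print("ancestor-namespace pids:", ancestors)
print("pid in own pidns:", own)
```

With the sample above, the ancestor pids are 21776 and 5, and only pid 1 is meaningful inside the innermost namespace.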
Now it may be fair to say "this is proc, and proc and sys show
host info which is not containerized and cannot be checkpointed
and restarted, deal with it." But I'm not sure.
There are two ways you could deal with this. One would be to
show the nspids only to the level of the reader of the file - but
I don't think you need to do that. I think you're better off
simply showing the pids up to the level of the struct pid for
the mounter of the procfs. So if I'm inside container c2 which
is inside container c1, my own /proc will only show pids which
are valid in c2 (and any child namespaces), while the /proc
mounted in c1 will show pids valid in c1 and c2 (and any children),
but not those in the init_pid_ns. It's then just up to the
container administrators to make sure that c2 cannot see c1's
/proc to confuse itself and confuddle checkpoint-restart.
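A rough Python sketch of the second option, for illustration only and not kernel code. The names visible_pids and mounter_level are hypothetical, and the sketch assumes the task's chain of pid namespaces passes through the pidns of the procfs mounter, so that list index equals nesting level (0 = init_pid_ns).

```python
def visible_pids(pids_by_level, mounter_level):
    """pids_by_level[i] is the task's pid in the pidns at nesting level i.

    Return the pids a procfs mounted from a pidns at mounter_level would
    print: that level and any child levels, nothing above it. The result
    is empty if the task does not exist at that level.
    """
    return pids_by_level[mounter_level:]


# Task from the patch description: 21776 on the host, 5 in c1, 1 in c2.
task = [21776, 5, 1]
for name, level in (("host", 0), ("c1", 1), ("c2", 2)):
    print(name, visible_pids(task, level))
```

Here the host /proc would list all three pids, c1's /proc would list 5 and 1, and c2's /proc would list only 1, so nothing read from c2's own /proc refers to a parent namespace.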