[PATCH 0/2] namespaces: log namespaces per task

Richard Guy Briggs rgb at redhat.com
Mon May 5 21:29:05 UTC 2014


On 14/05/02, Serge Hallyn wrote:
> Quoting Richard Guy Briggs (rgb at redhat.com):
> > On 14/05/02, Serge E. Hallyn wrote:
> > > Quoting Richard Guy Briggs (rgb at redhat.com):
> > > > I saw no replies to my questions when I replied a year after Aris' posting, so
> > > > I don't know whether they were ignored or got lost in stale threads:
> > > >         https://www.redhat.com/archives/linux-audit/2013-March/msg00020.html
> > > >         https://www.redhat.com/archives/linux-audit/2013-March/msg00033.html
> > > > 	(https://lists.linux-foundation.org/pipermail/containers/2013-March/032063.html)
> > > >         https://www.redhat.com/archives/linux-audit/2014-January/msg00180.html
> > > > 
> > > > I've tried to answer a number of questions that were raised in that thread.
> > > > 
> > > > The goal is not quite identical to Aris' patchset.
> > > > 
> > > > The purpose is to track namespaces in use by logged processes from the
> > > > perspective of init_*_ns.  The first patch defines a function to list them.
> > > > The second patch provides an example of usage for audit_log_task_info() which
> > > > is used by syscall audits, among others.  audit_log_task() and
> > > > audit_common_recv_message() would be other potential use cases.
> > > > 
> > > > Use a serial number per namespace (unique across one boot of one kernel)
> > > > instead of the inode number (for which the right to change has reportedly
> > > > been reserved, and which is not necessarily unique if there is more than
> > > > one proc fs).  It could be argued that the inode numbers have by now become
> > > > a de facto interface and can't change, but I'm proposing this approach to
> > > > see if it helps address some of the objections to the earlier patchset.
> > > > 
> > > > Messages could also be added to track the creation and destruction of
> > > > namespaces, listing the parent for hierarchical namespaces such as pidns
> > > > and userns, and listing other ids for non-hierarchical namespaces, as well
> > > > as other information to help identify a namespace.
> > > > 
> > > > There has been some progress made for audit in net namespaces and pid
> > > > namespaces since this previous thread.  net namespaces are now served as peers
> > > > by one auditd in the init_net namespace with processes in a non-init_net
> > > > namespace being able to write records if they are in the init_user_ns and have
> > > > CAP_AUDIT_WRITE.  Processes in a non-init_pid_ns can now similarly write
> > > > records.  As for CAP_AUDIT_READ, I just posted a patchset to check capabilities
> > > > of userspace processes that try to join netlink broadcast groups.
> > > > 
> > > > 
> > > > Questions:
> > > > Is there a way to link serial numbers of namespaces involved in migration of a
> > > > container to another kernel?  (I had a brief look at CRIU.)  Is there a unique
> > > > identifier for each running instance of a kernel?  Or at least some identifier
> > > > within the container migration realm?
> > > 
> > > Eric Biederman has always been adamantly opposed to adding a namespace
> > > of namespaces, so the fact that you're asking this question concerns me.
> > 
> > I have seen that position and I don't fully understand the justification
> > for it other than added complexity.
> > 
> > One way that occurred to me to identify a kernel instance was to look at
> > CPU serial numbers or some other CPU identifier intended to be globally
> > unique, but that isn't universally available.
> 
> That's one issue, which is uniqueness of namespaces cross-machines.
> 
> But it gets worse if we consider that after allowing in-container audit,
> we'll have a nested container running, then have the parent container
> migrated to another host (or just checkpointed and restarted); now the
> nested container's indexes will all be changed.  Is there any way audit
> can track who's who after the migration?

Presumably the namespace serial numbers before and after would be logged
in one message to tie them together.

> That's not an indictment of the serial # approach, since (a) we don't
> have in-container audit yet and (b) we don't have c/r/migration of nested
> containers.  But it's worth considering whether we can solve the issue
> with serial #s, and, if not, whether we can solve it with any other
> approach.
> 
> I guess one approach to solve it would be to allow userspace to request
> a next serial #.  Which will immediately lead us to a namespace of serial
> #s (since the requested # might be lower than the last used one on the
> new host).

:P

> As you've said inode #s for /proc/self/ns/* probably aren't sufficiently
> unique, though perhaps we could attach a generation # for the sake of
> audit.  Then after a c/r/migration the generation # may be different,
> but we may have a better shot at at least using the same ino#.

A generation number is an interesting idea.  Would it get incremented
every time a namespace is c/r/migrated?  Or just if there is a conflict?

Same ino#?  Or same sn?

> > - RGB

- RGB

--
Richard Guy Briggs <rbriggs at redhat.com>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545

