[PATCH 1/1] Syslog are now containerized

Matt Helsley matthltc at us.ibm.com
Sat Feb 13 10:11:58 PST 2010


On Thu, Feb 11, 2010 at 11:48:43AM -0600, Serge E. Hallyn wrote:
> Quoting Jean-Marc Pigeon (jmp at safe.ca):
> > 	Added syslog.c such container /proc/kmsg and host /proc/kmsg
> > 	do not leak in each other.
> > 	Running rsyslog daemon within a container won't destroy
> > 	host kernel messages.
> 
> Thanks Jean-Marc.  But this really isn't doing most of what I'd
> recommended in my last emails (both public and private.  In
> particular:
> 
> > index cc4f453..3d0c73e 100644
> > --- a/include/linux/user_namespace.h
> > +++ b/include/linux/user_namespace.h
> > @@ -14,6 +14,7 @@ struct user_namespace {
> >  	struct hlist_head	uidhash_table[UIDHASH_SZ];
> >  	struct user_struct	*creator;
> >  	struct work_struct	destroyer;
> > +	struct syslog_ns        *syslog;
> 
> syslog_ns should be moved into nsproxy and unshared with a
> separate clone(CLONE_SYSLOG);

I agree. Keeping it in the user_namespace is strange. It should
be in the nsproxy along with all of the other namespace pointers.

> 
> > -static void emit_log_char(char c)
> > +static void emit_log_char(struct syslog_ns *syslog_ns, char c)
> >  {
> > -	LOG_BUF(log_end) = c;
> > -	log_end++;
> > -	if (log_end - log_start > log_buf_len)
> > -		log_start = log_end - log_buf_len;
> > -	if (log_end - con_start > log_buf_len)
> > -		con_start = log_end - log_buf_len;
> > -	if (logged_chars < log_buf_len)
> > -		logged_chars++;
> > +	LOG_BUF(syslog_ns, sys_log_end) = c;
> 
> Taking syslog_ns from current in emit_log_char is not right.
> Emit_log_char() is called from printk which (the comment above
> printk warns us) can be called from any context.
> 
> That was why I suggested:
> 
> >! I think my patch is fundamentally wrong anyway:  we should not
> >! use current's context at vprintk like i did.  We should use an
> >! optional passed-in context from those callsites where we want to,
> >! and default to init otherwise.  That means
> >! 
> >! 1. put the core of vprintk() into vnsprintk() which takes a syslog
> >! namespace as argument
> >! 
> >! 2. make vprintk() a wrapper around vnsprintk() which passes in
> >! init_syslog_ns to vnsprintk().  printk can remain unchanged.
> >! 
> >! 4. make nsprintk() a wrapper around vnsprintk() which takes a syslog
> >! ns argument and pass it to vnsprintk()
> >! 
> >! 3. do_syslog() can obviously be containerized the same way it
> >! is now.
> >! 
> >! 4. take a printk call like the iptables ones you want and turn
> >! int into nsprintk syscall.

Definitely. Actually, I think we can go one step further and say that
passing syslog_ns directly to nsprintk() is wrong too. It's wrong
because we only know the namespace that's relevant to the printk
contents -- and that won't usually be a syslog_ns.

Better to have the "find corresponding syslog_ns" logic inside
nsprintk(). The great thing about doing it this way is if we
ever decide to duplicate printk output to multiple containers it
can be done inside nsprintk rather than doing a flag-day patch.

Cheers,
	-Matt Helsley


More information about the Containers mailing list