[PATCH] c/r: fix race of prepare_descendant() with an ongoing fork()

Serge E. Hallyn serue at us.ibm.com
Tue Sep 22 14:46:19 PDT 2009


Quoting Serge E. Hallyn (serue at us.ibm.com):
> Quoting Oren Laadan (orenl at librato.com):
> > From: Oren Laadan <orenl at librato.edu>
> > 
> > If prepare_descendants() is walking a tree and one of the tasks is
> > forking, one of two bads can happen. If the child doesn't inherit the
> > ->ctx, it breaks the assumption that the entire subtree is prepared.
> > If the child inherits the ->ctx, it will have one without having taken
> > a reference.
> > 
> > This patch closed this race by explicitly getting and referencing the
> > ->ctx for a child process should the parent have one, atomically under
> > the tasklist_lock.
> > 
> > Signed-off-by: Oren Laadan <orenl at cs.columbia.edu>
> > ---
> >  kernel/fork.c |   11 ++++++++---
> >  1 files changed, 8 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 9f13d7b..57118e4 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -62,6 +62,7 @@
> >  #include <linux/fs_struct.h>
> >  #include <linux/magic.h>
> >  #include <linux/perf_counter.h>
> > +#include <linux/checkpoint.h>
> > 
> >  #include <asm/pgtable.h>
> >  #include <asm/pgalloc.h>
> > @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> >  	INIT_LIST_HEAD(&p->pi_state_list);
> >  	p->pi_state_cache = NULL;
> >  #endif
> > -#ifdef CONFIG_CHECKPOINT
> > -	p->checkpoint_ctx = NULL;
> > -#endif
> >  	/*
> >  	 * sigaltstack should be cleared when sharing the same VM
> >  	 */
> > @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> >  	/* Need tasklist lock for parent etc handling! */
> >  	write_lock_irq(&tasklist_lock);
> > 
> > +#ifdef CONFIG_CHECKPOINT
> > +	/* If parent is restarting, child should be too */
> > +	if (unlikely(current->checkpoint_ctx)) {
> > +		p->checkpoint_ctx = current->checkpoint_ctx;
> 
> Won't break anything, but technically p->checkpoint_ctx will
> already be copied from current->checkpoint_ctx, so only the
> ckpt_ctx_get() is necessary, so this could really read
> 
> 	if (p->checkpoint_ctx)
> 		ckpt_ctx_get(p->checkpoint_ctx);
> 
> Right?

BTW since all I'm doing is nit-picking, I obviously agree with
the patch on the whole :)

There is no way for the task to be forked into a traced state
(without the parent being traced) right?  And, is the fact that
ctx->nr_total may end up less than the total number of active
tasks a problem at all?

thanks,
-serge


More information about the Containers mailing list