[PATCH 6/6] cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task()

Thu Sep 1 05:58:24 PDT 2011

On Thu, Sep 01, 2011 at 08:22:21PM +0900, Tejun Heo wrote:
> Hello,
> 
> On Wed, Aug 31, 2011 at 03:42:24PM +0200, Frederic Weisbecker wrote:
> > My task counter subsystem patchset brings a cancel_attach_task() callback
> > that cancels can_attach_task() effects.
> > 
> > I thought that rebased on top of your patch it's going to be merged inside
> > cancel_attach() but OTOH we can't cancel the effect of failed migration
> > on a thread that way.
> > 
> > May be we need to keep a cancel_attach_task() just for that purpose?
> 
> We can do that but I think that becomes a bit too complex and fragile.
> That path won't be traveled unless it races against exit.  Bugs will
> be difficult to detect and reproduce.  In this respect, the current
> code already seems racy.  ->can_attach (or other methods in the attach
> path) and ->exit can race each other and I don't think all subsystems
> handle that properly.

I guess subsystems don't care about that currently, although I haven't
checked. But the task counter will need this per thread cancellation in
migration failure, without the need for any synchronization between
can_attach, attach and exit.

Now if we want to solve this with a synchronization there, either
we task_lock() every tasks in the group in the beginning of cgroup_attach_proc().
But that's not very nice. Or we use cgroup_mutex on cgroup_exit() exit,
but that's even worse.

I guess we need the leader->sighand->siglock on both cgroup_attach_proc()
and cgroup_exit(). Besides we may have more reasons to have that:

https://lkml.org/lkml/2011/8/15/342

> 
> IMHO the right thing to do here is simplifying synchronization rules
> so that nothing else happens while migration is in progress.
> 
> Thanks.
> 
> -- 
> tejun