[PATCH 6/6] Makes procs file writable to move all threads by tgid at once

Paul Menage menage at google.com
Mon Aug 3 11:07:54 PDT 2009

On Mon, Aug 3, 2009 at 10:54 AM, Serge E. Hallyn<serue at us.ibm.com> wrote:
>> +     if (guarantee) {
>> +             /*
>> +              * our caller promises us that the css_set we want already
>> +              * exists, so we use find_existing_css_set directly.
>> +              */
>> +             struct cgroup_subsys_state *template[CGROUP_SUBSYS_COUNT];
>> +             read_lock(&css_set_lock);
>> +             newcg = find_existing_css_set(oldcg, cgrp, template);
>> +             BUG_ON(!newcg);
>> +             get_css_set(newcg);
>> +             read_unlock(&css_set_lock);
>> +     } else {
>> +             might_sleep();
> So cgroup_task_migrate() might sleep, but

might sleep if guarantee==0

>> +             retval = cgroup_task_migrate(cgrp, oldcgrp, tsk, 1);
> Here it is called under rcu_read_lock().

With guarantee==1

> What *exactly* is it we are protecting with cgroup_fork_mutex?
> 'fork' (as the name implies) is not a good answer, since we should be
> protecting data, not code.  If it is solely tsk->cgroups, then perhaps
> we should in fact try switching to (s?)rcu.  Then cgroup_fork() could
> just do rcu_read_lock, while cgroup_task_migrate() would make the change
> under a spinlock (to protect against concurrent cgroup_task_migrate()s),
> and using rcu_assign_pointer to let cgroup_fork() see consistent data
> either before or after the update...  That might mean that any checks done
> before completing the migrate which involve the # of tasks might become
> invalidated before the migration completes?

What's being protected is the ability to move an entire thread group
to the new destination cgroup, even in the presence of concurrent
thread clone operations. New threads aren't visible to other threads
until the point when they're attached to the tasklist, so if any
concurrent do_fork() operation is somewhere between the call to
cgroup_fork() and the attachment to the tasklist when the "attach proc
to new cgroup" operation occurs, it may get left behind.


More information about the Containers mailing list