[PATCH 5/6] Makes procs file writable to move all threads by tgid at once

Benjamin Blum bblum at google.com
Fri Jul 24 13:53:53 PDT 2009

On Fri, Jul 24, 2009 at 1:47 PM, Paul Menage<menage at google.com> wrote:
> On Fri, Jul 24, 2009 at 10:23 AM, Matt Helsley<matthltc at us.ibm.com> wrote:
>> Well, I imagine holding tasklist_lock is worse than cgroup_mutex in some
>> ways since it's used even more widely. Makes sense not to use it here..
> Just to clarify - the new "procs" code doesn't use cgroup_mutex for
> its critical section, it uses a new cgroup_fork_mutex, which is only
> taken for write during cgroup_proc_attach() (after all setup has been
> done, to ensure that no new threads are created while we're updating
> all the existing threads). So in general there'll be zero contention
> on this lock - the cost will be the cache misses due to the rwlock
> bouncing between the different CPUs that are taking it in read mode.

Right. The options so far are:

Global rwsem: Needs only one lock, but blocks all forking while a
write is in progress. The write should be quick enough if it just
iterates down the threadgroup list in O(n). In the uncontended case,
fork() slows down by a cache miss when taking the lock in read mode.

Threadgroup-local rwsem: Needs a new field in task_struct. Only
forks within the same threadgroup would block on a write to the procs
file, and the zero-contention case is the same as before.

Using tasklist_lock: Currently, the call to cgroup_fork() (which
starts the race) is far above where tasklist_lock is taken in
fork, so taking tasklist_lock earlier is infeasible. Could
cgroup_fork() be moved down to inside it, and if so, how much
restructuring would be needed? Even then, this adds work that
is done (unnecessarily) while holding a global lock.

> What happened to the big-reader lock concept from 2.4.x? That would be
> applicable here - minimizing the overhead on the critical path when
> the write operation is expected to be very rare.

Seems like a good application, but it appears to be gone from the
current kernel. Also, from my understanding, it would have to be a
global (or at least not threadgroup-local) lock, no? If we used one
and tried to write to the procs file while a bunch of forks were in
progress, how long would the write operation have to block? (With a
rwsem, at least, the writing thread seems to get the lock rather
quickly under contention.) Depending on just how slow write-locking
one of these is, it might kill any hope of performing a write while
forks are in progress.
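
For reference, the general shape of a big-reader lock as I understand
it, modeled in userspace. This is a sketch in the spirit of the idea,
not the 2.4 implementation: struct br_lock, NR_SLOTS and the br_*()
helpers are hypothetical, and a pthread mutex per slot stands in for
per-CPU lock state. A reader touches only its own slot, while a writer
has to take every slot in turn, which is where the write-side cost
comes from.

```c
#include <assert.h>
#include <pthread.h>

#define NR_SLOTS 8	/* hypothetical number of CPUs */

/* One lock per CPU slot; a real version would pad each slot to a cache line. */
struct br_lock {
	pthread_mutex_t slot[NR_SLOTS];
};

static void br_init(struct br_lock *l)
{
	for (int i = 0; i < NR_SLOTS; i++)
		pthread_mutex_init(&l->slot[i], NULL);
}

/* Read side: only the caller's own slot is touched, so no cross-CPU bouncing. */
static void br_read_lock(struct br_lock *l, int cpu)
{
	pthread_mutex_lock(&l->slot[cpu]);
}

static void br_read_unlock(struct br_lock *l, int cpu)
{
	pthread_mutex_unlock(&l->slot[cpu]);
}

/*
 * Write side: take every slot in ascending order, waiting out whichever
 * reader currently holds each one. Returns the number of locks taken.
 */
static int br_write_lock(struct br_lock *l)
{
	int taken = 0;

	for (int i = 0; i < NR_SLOTS; i++) {
		pthread_mutex_lock(&l->slot[i]);
		taken++;
	}
	return taken;
}

static void br_write_unlock(struct br_lock *l)
{
	for (int i = NR_SLOTS - 1; i >= 0; i--)
		pthread_mutex_unlock(&l->slot[i]);
}
```

So in this model a write costs NR_SLOTS acquisitions, each of which
may have to wait behind a fork on that slot, versus a single
acquisition for the rwsem.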
