[PATCH v5 3/3] cgroups: make procs file writable
kamezawa.hiroyu at jp.fujitsu.com
Mon Jan 3 16:57:16 PST 2011
On Mon, 27 Dec 2010 05:12:28 -0500
Ben Blum <bblum at andrew.cmu.edu> wrote:
> > > > About nodemask, I feel what's required is pre_pre_attach() to cache required memory
> > > > before attaching as radix_tree_preload(). Then, subsys can prepare for creating
> > > > working area.
> > >
> > > To save on global memory footprint, we could pre-allocate two nodemasks,
> > > but I feel like it's not worth the increase in code complexity. This
> > > would need to be done in the other cases that unsafely do NODEMASK_ALLOC
> > > too... too much to keep track of for little gain.
> > >
> > But NODEMASK_ALLOC cannot be on stack when we consider 4096node systems.
> Hence global allocation, instead of on-stack. (Also, for this particular
> case, the state of the nodemasks needs to persist from pre_attach to
> attach_task to attach, so it can't be static inside the function
Yes, it can be global/static.
> > > > Hmm...but I wonder de_thread() should take threadgroup_fork_write_unlock().
> > > >
> > > > I may not understand anything important but I feel taking tasklist_lock() is overkill.
> > >
> > > Would rcu_read_lock() be any better? Everywhere else in the kernel that
> > > iterates over all threads in a group uses either rcu_read_lock or
> > > tasklist_lock.
> > >
> > Because it means that can_attach()/attach() cannot sleep, it seems to make no
> > difference.
> can_attach_task(), pre_attach(), and attach_task() cannot sleep, but
> can_attach() and attach() may. Careful not to confuse them. :P
> > I wonder.... if you stops all clone()/fork() of the proc in moving, you can use
> > find_ge_pid(). Please see next_tgid() or next_tid() in fs/proc/base.c which
> > implements scanning tasklist with sleep. Can't you use next_tid() ?
> If I'm not mistaken, that approach is vulnerable to an exit() race -
> next_tid() may return NULL if pid_alive() fails, and then we stop
> iterating and miss some threads. (The relevant code is kernel/exit.c,
> __unhash_process, which is protected by tasklist_lock and sighand->lock,
> but nothing else.)
You can use pid array, find_ge_pid() (or some), if the linked-list
is not dependable. If CLONE_THREAD is blocked, it works to catch
all threads without races.
More information about the Containers