[PATCH v5 3/3] cgroups: make procs file writable

KAMEZAWA Hiroyuki kamezawa.hiroyu at jp.fujitsu.com
Mon Jan 3 16:57:16 PST 2011


On Mon, 27 Dec 2010 05:12:28 -0500
Ben Blum <bblum at andrew.cmu.edu> wrote:

> > > > About nodemask, I feel what's required is pre_pre_attach() to cache required memory
> > > > before attaching as radix_tree_preload(). Then, subsys can prepare for creating
> > > > working area.
> > > 
> > > To save on global memory footprint, we could pre-allocate two nodemasks,
> > > but I feel like it's not worth the increase in code complexity. This
> > > would need to be done in the other cases that unsafely do NODEMASK_ALLOC
> > > too... too much to keep track of for little gain.
> > > 
> > 
> > But NODEMASK_ALLOC cannot be on stack when we consider 4096node systems.
> 
> Hence global allocation, instead of on-stack. (Also, for this particular
> case, the state of the nodemasks needs to persist from pre_attach to
> attach_task to attach, so it can't be static inside the function
> either.)
> 

Yes, it can be global/static.



> > > > Hmm...but I wonder de_thread() should take threadgroup_fork_write_unlock().
> > > > 
> > > > I may not understand anything important but I feel taking tasklist_lock() is overkill.
> > > 
> > > Would rcu_read_lock() be any better? Everywhere else in the kernel that
> > > iterates over all threads in a group uses either rcu_read_lock or
> > > tasklist_lock.
> > > 
> > 
> > Because it means that can_attach()/attach() cannot sleep, it seems to make no
> > difference.
> 
> can_attach_task(), pre_attach(), and attach_task() cannot sleep, but
> can_attach() and attach() may. Careful not to confuse them. :P
> 
> > I wonder.... if you stops all clone()/fork() of the proc in moving, you can use
> > find_ge_pid(). Please see next_tgid() or next_tid() in fs/proc/base.c which
> > implements  scanning tasklist with sleep. Can't you use next_tid() ?
> 
> If I'm not mistaken, that approach is vulnerable to an exit() race -
> next_tid() may return NULL if pid_alive() fails, and then we stop
> iterating and miss some threads. (The relevant code is kernel/exit.c,
> __unhash_process, which is protected by tasklist_lock and sighand->lock,
> but nothing else.)
>

You can use pid array, find_ge_pid() (or some), if the linked-list
is not dependable. If CLONE_THREAD is blocked, it works to catch
all threads without races.

Thanks,
-Kame



More information about the Containers mailing list