memrlimit controller merge to mainline

Hugh Dickins hugh at veritas.com
Mon Aug 4 14:52:36 PDT 2008


On Tue, 5 Aug 2008, Balbir Singh wrote:
> Hugh Dickins wrote:
> [snip]
> > 
> > BUG: unable to handle kernel paging request at 6b6b6b8b
> > IP: [<7817078f>] memrlimit_cgroup_uncharge_as+0x18/0x29
> > Pid: 22500, comm: swapoff Not tainted (2.6.26-rc8-mm1 #7)
> >  [<78161323>] ? exit_mmap+0xaf/0x133
> >  [<781226b1>] ? mmput+0x4c/0xba
> >  [<78165ce3>] ? try_to_unuse+0x20b/0x3f5
> >  [<78371534>] ? _spin_unlock+0x22/0x3c
> >  [<7816636a>] ? sys_swapoff+0x17b/0x37c
> >  [<78102d95>] ? sysenter_past_esp+0x6a/0xa5
> 
> I am unable to reproduce the problem,

Me neither, I've spent many hours trying 2.6.27-rc1-mm1 and then
back to 2.6.26-rc8-mm1.  But I've been SO stupid: saw it originally
on one machine with SLAB_DEBUG=y, have been trying since mostly on
another with SLUB_DEBUG=y, but never thought to boot with
slub_debug=P,task_struct until now.

> but I do have an initial hypothesis
> 
> CPU0					CPU1
> 					try_to_unuse
> task 1 starts exiting			look at mm = task1->mm
> ..					increment mm_users
> task 1 exits
> mm->owner needs to be updated, but
> no new owner is found
> (mm_users > 1, but no other task
> has task->mm = task1->mm)
> mm_update_next_owner() leaves
> 
> grace period
> 					user count drops, call mmput(mm)
> task 1 freed
> 					dereferencing mm->owner fails

Yes, that looks right to me: seems obvious now.  I don't think your
careful alternation of CPU0/1 events at the end matters: the swapoff
CPU simply dereferences mm->owner after that task has gone.
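
A minimal userspace sketch of that sequence, for illustration only. It is not kernel code: the struct layouts, the model_* names and the 0x20 field offset are all hypothetical, with the offset chosen only because 0x6b6b6b6b + 0x20 matches the faulting address 6b6b6b8b in the oops above. The one real constant is 0x6b, the byte the slab debug code writes over freed objects.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define POISON_FREE 0x6b        /* slab debug poison byte for freed objects */

/* Hypothetical 32-bit layout: kernel pointers are modelled as uint32_t. */
struct model_task {
	uint32_t pad[4];
	uint32_t cgroup_ptr;    /* stands in for the pointer chased via mm->owner */
};

struct model_mm {
	int mm_users;
	struct model_task *owner;
};

/* "Free" the task: the slot stays addressable in this model, but is poisoned. */
static void model_free_task(struct model_task *t)
{
	memset(t, POISON_FREE, sizeof(*t));
}

/* Address that a read at `offset` behind the owner's pointer would touch. */
static uint32_t model_uncharge_fault_addr(const struct model_mm *mm,
					  uint32_t offset)
{
	return mm->owner->cgroup_ptr + offset;
}

/* Replays the timeline from the hypothesis in a single thread. */
static uint32_t model_race(void)
{
	struct model_task task1 = { {0}, 0x1000 };
	struct model_mm mm = { 1, &task1 };

	mm.mm_users++;            /* swapoff CPU: try_to_unuse pins the mm */
	mm.mm_users--;            /* task 1 exits; no new owner, mm.owner left stale */
	model_free_task(&task1);  /* after the grace period, task 1 is freed */
	mm.mm_users--;            /* swapoff CPU: mmput() drops the last user */
	assert(mm.mm_users == 0);

	/* exit_mmap -> uncharge now follows the stale owner pointer */
	return model_uncharge_fault_addr(&mm, 0x20);
}
```

In this model the order of the last two CPU0/CPU1 steps makes no difference: once the task slot is poisoned, any later read through mm.owner yields the 0x6b pattern.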

(That's a shame, I'd always hoped that mm->owner->comm was going to
be good for use in mm messages, even when tearing down the mm.)

> I do have a potential solution in mind, but I want to make sure my
> hypothesis is correct.

It seems wrong that memrlimit_cgroup_uncharge_as should be called
after mm->owner may have been changed, even if it's to something safe.
But I forget the mm/task exit details, surely they're tricky.

By the way, is the ordering in mm_update_next_owner the best?
Would there be less movement if it searched amongst siblings before
it searched amongst children?  Ought it to make a first pass trying
to stay within the same cgroup?
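
To make those questions concrete, here is a rough userspace sketch of one possible search order: siblings before children before everything else, with a first pass restricted to the old owner's cgroup. The types and function names are hypothetical and this is not how mm_update_next_owner is written; it only shows the selection logic being asked about.

```c
#include <assert.h>
#include <stddef.h>

enum relation { REL_CHILD, REL_SIBLING, REL_OTHER };

/* Hypothetical flattened view of a task that might take over the mm. */
struct candidate {
	int pid;
	enum relation rel;   /* relation to the exiting owner */
	int cgroup_id;
	int shares_mm;       /* nonzero if task->mm is the mm in question */
};

/*
 * First candidate with the given relation that shares the mm.
 * A negative cgroup_id means "any cgroup".
 */
static const struct candidate *scan(const struct candidate *c, size_t n,
				    enum relation rel, int cgroup_id)
{
	for (size_t i = 0; i < n; i++)
		if (c[i].shares_mm && c[i].rel == rel &&
		    (cgroup_id < 0 || c[i].cgroup_id == cgroup_id))
			return &c[i];
	return NULL;
}

static const struct candidate *pick_next_owner(const struct candidate *c,
						size_t n, int old_cgroup)
{
	static const enum relation order[] = {
		REL_SIBLING, REL_CHILD, REL_OTHER
	};

	assert(c != NULL || n == 0);

	/* pass 0: stay within the old cgroup; pass 1: accept any cgroup */
	for (int pass = 0; pass < 2; pass++) {
		for (size_t r = 0; r < 3; r++) {
			const struct candidate *hit =
				scan(c, n, order[r],
				     pass == 0 ? old_cgroup : -1);
			if (hit)
				return hit;
		}
	}
	return NULL;             /* nobody else uses this mm */
}
```

Whether this ordering really means less movement depends on how tasks sharing an mm are typically related, which the sketch does not try to answer.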

Hugh

