[RFD] Merge task counter into memcg

Johannes Weiner hannes at cmpxchg.org
Thu Apr 12 17:23:09 UTC 2012


On Thu, Apr 12, 2012 at 09:38:25AM -0700, Tejun Heo wrote:
> Hello, Johannes.
> 
> On Thu, Apr 12, 2012 at 05:30:55PM +0200, Johannes Weiner wrote:
> > To reraise a point from my other email that was ignored: do users
> > actually really care about the number of tasks when they want to
> > prevent forkbombs?  If a task would use neither CPU nor memory, you
> > would not be interested in limiting the number of tasks.
> > 
> > Because the number of tasks is not a resource.  CPU and memory are.
> >
> > So again, if we would include the memory impact of tasks properly
> > (structures, kernel stack pages) in the kernel memory counters which
> > we allow to limit, shouldn't this solve our problem?
> 
> The task counter is trying to control the *number* of tasks, which is
> purely memory overhead.  Translating #tasks into the actual amount of
> memory isn't too trivial tho - the task stack isn't the only
> allocation and the numbers should somehow make sense to the userland
> in a consistent way.

But why would you ever care about that number?  It has no intrinsic
value.  We used it in the past because we had no other control over
kernel memory and CPU usage.

Even if we start out accounting just the kernel stack (which should be
the biggest chunk), it won't be less accurate than limiting the number
of tasks.  It's just a different unit, but one which we can account and
limit with less extra code, and even improve as we go along.
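
To illustrate the "different unit" point, here is a toy sketch in Python
(not kernel code).  The KmemCounter class, the 8K stack size and the
fork_until_refused() helper are all made up for illustration; the real
stack size depends on architecture and config.  It only shows that
charging the stack on every fork bounds the number of tasks without a
separate task counter.

```python
# Hypothetical per-task kernel stack size; the real value depends on
# architecture and kernel configuration.
KERNEL_STACK_BYTES = 8 * 1024


class KmemCounter:
    """Toy byte counter with a hard limit (not the kernel's own counter)."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.usage = 0

    def try_charge(self, nbytes):
        """Account nbytes if it fits under the limit, else refuse."""
        if self.usage + nbytes > self.limit:
            return False
        self.usage += nbytes
        return True

    def uncharge(self, nbytes):
        """Give back nbytes, e.g. when a task exits."""
        self.usage -= nbytes


def fork_until_refused(counter, per_task_bytes=KERNEL_STACK_BYTES):
    """Return how many tasks get created before a charge is refused."""
    tasks = 0
    while counter.try_charge(per_task_bytes):
        tasks += 1
    return tasks
```

With a 1M limit and these made-up numbers, a forkbomb stops at 128 tasks,
the same effect a task counter would have, just expressed in bytes.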

[ Even if you tune your task counter limit perfectly to one kernel
  version, the next version will have changed the memory required per
  task, file, or random object, and suddenly your working setup runs
  out of memory.  So it's not like starting with kernel stack and
  adding more stuff later would be any less predictable. ]
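
A back-of-the-envelope version of that drift, as a Python sketch.  All
numbers here (64M of kernel memory to spare, 16K per task on the old
kernel, 24K on the new one) are invented for illustration and do not
describe any real kernel version.

```python
def kernel_memory_used(nr_tasks, bytes_per_task):
    """Kernel memory pinned by nr_tasks tasks at a given per-task cost."""
    return nr_tasks * bytes_per_task


def max_tasks_under_byte_limit(limit_bytes, bytes_per_task):
    """Number of tasks that fit when the limit is expressed in bytes."""
    return limit_bytes // bytes_per_task


# Hypothetical budget and per-task costs.
AVAILABLE = 64 * 1024 * 1024
OLD_BYTES_PER_TASK = 16 * 1024
NEW_BYTES_PER_TASK = 24 * 1024

# Task limit tuned so the old kernel exactly fills the budget.
task_limit = AVAILABLE // OLD_BYTES_PER_TASK

old_usage = kernel_memory_used(task_limit, OLD_BYTES_PER_TASK)
new_usage = kernel_memory_used(task_limit, NEW_BYTES_PER_TASK)

# The same task limit now pins more memory than was budgeted.
overshoot = new_usage - AVAILABLE

# A byte limit keeps the budget and simply admits fewer tasks.
tasks_on_new_kernel = max_tasks_under_byte_limit(AVAILABLE, NEW_BYTES_PER_TASK)
```

The task limit stays at 4096 across the upgrade, but the memory behind it
grows from 64M to 96M.  The byte limit stays at 64M and the number of
tasks that fit drops instead.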

I don't think anyone wants to come back in a few months and discuss
where the nr-of-open-files counter subsystem should live.

> Also, I'm not sure whether this particular limit should live in its
> silo or should be summed up together as part of kmem (kmem itself is
> in its own silo after all apart from user memory, right?).

There is k (kernel memory) and u+k (user plus kernel memory).  I don't
see a technical problem with adding a separate stat for it later, but
also no particular reason to treat it differently, because it's
nothing special.  It's just kernel memory.  Do you care if your cgroup
has 2M tasks with one open socket each or one task with 2M sockets, as
long as the group plays along nicely with the others?
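
A toy model of the k and u+k arrangement, in Python.  It assumes kernel
charges count against both counters and user charges against u+k only.
The GroupCounters class, the object sizes and the helper are
hypothetical, not the memcg implementation.  Tasks and sockets go
through the same charge path because both are just kernel memory.

```python
class GroupCounters:
    """Toy model of a k counter nested inside a u+k counter."""

    def __init__(self, kmem_limit, total_limit):
        self.kmem_limit = kmem_limit
        self.total_limit = total_limit
        self.kmem = 0    # k: kernel memory only
        self.total = 0   # u+k: user plus kernel memory

    def charge_kernel(self, nbytes):
        """Kernel memory counts against both k and u+k."""
        if self.kmem + nbytes > self.kmem_limit:
            return False
        if self.total + nbytes > self.total_limit:
            return False
        self.kmem += nbytes
        self.total += nbytes
        return True

    def charge_user(self, nbytes):
        """User memory counts against u+k only."""
        if self.total + nbytes > self.total_limit:
            return False
        self.total += nbytes
        return True


# Hypothetical object sizes in bytes.
TASK_BYTES = 8 * 1024
SOCKET_BYTES = 1024


def charge_tasks_and_sockets(group, nr_tasks, nr_sockets):
    """Charge tasks and sockets through the same kernel-memory path."""
    nbytes = nr_tasks * TASK_BYTES + nr_sockets * SOCKET_BYTES
    return group.charge_kernel(nbytes)
```

Whether the bytes come from many tasks or many sockets makes no
difference to the counters; only the sum is checked against the limits.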

