[PATCH 00/10] cgroups: Task counter subsystem v8
thockin at hockin.org
Mon Apr 1 21:02:06 UTC 2013
On Mon, Apr 1, 2013 at 1:29 PM, Tejun Heo <tj at kernel.org> wrote:
> On Mon, Apr 01, 2013 at 01:09:09PM -0700, Tim Hockin wrote:
>> Pardon my ignorance, but... what? Use kernel memory limits as a proxy
>> for process/thread counts? That sounds terrible - I hope I am
>> misunderstanding? This task counter patch had several properties that
>> mapped very well to what we want.
> Well, the argument was that process / thread counts were a poor and
> unnecessary proxy for kernel memory consumption limit. IIRC, Johannes
> put it as (I'm paraphrasing) "you can't go to Fry's and buy 4k thread
> worth of component".
>> Is it dead in the water?
> After some discussion, Frederic agreed that at least his use case can
> be served well by kmemcg, maybe even better - IIRC it was container
> fork bomb scenario, so you'll have to argue your way in why kmemcg
> isn't a suitable solution for your use case if you wanna revive this.
We run dozens of jobs from dozens of users on a single machine. We
regularly experience users who leak threads, running into the tens of
thousands. We are unable to raise the PID_MAX significantly due to
some bad, but really thoroughly baked-in decisions that were made a
long time ago. What we experience on a daily basis is users
complaining about getting a "pthread_create(): resource unavailable"
error because someone on the machine has leaked.
Today we use RLIMIT_NPROC to lock most users down to a smaller max.
But this is a per-user setting, not a per-container setting, and users
do not control where their jobs land. Scheduling decisions often put
multiple thread-heavy but non-leaking jobs from one user onto the same
machine, which again causes problems. Further, it does not help for
some of our use cases where a logical job can run as multiple UIDs for
different processes within.
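
For illustration only, here is a minimal sketch of the RLIMIT_NPROC
approach described above, assuming Linux and Python's standard
`resource` module. The helper names `clamp_soft_limit` and `cap_nproc`
are hypothetical, and this is not the tooling we actually run. The
kernel accounts this limit against the real UID of the caller, which is
why it cannot express a per-container cap.

```python
import resource


def clamp_soft_limit(requested, hard):
    """Return the soft limit to apply, never exceeding the hard limit."""
    inf = resource.RLIM_INFINITY
    if hard == inf:
        # No hard ceiling: any requested value is acceptable.
        return requested
    if requested == inf:
        # "Unlimited" was requested, but the hard limit is finite.
        return hard
    return min(requested, hard)


def cap_nproc(requested):
    """Set this process's soft RLIMIT_NPROC; the hard limit is untouched.

    The limit is inherited by children, but it is checked against the
    total number of tasks owned by the same real UID, not per job.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    soft = clamp_soft_limit(requested, hard)
    resource.setrlimit(resource.RLIMIT_NPROC, (soft, hard))
    return soft, hard
```

A job launcher could call something like `cap_nproc(4096)` before exec.
Two jobs from the same user on one machine would still share that
single budget, and a job split across several UIDs would get one
budget per UID.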