[RFC] [PATCH] Cgroup based OOM killer controller

Nikanth Karthikesan knikanth at suse.de
Wed Jan 21 22:42:23 PST 2009


On Thursday 22 January 2009 11:59:20 Arve Hjønnevåg wrote:
> On Wed, Jan 21, 2009 at 10:12 PM, Nikanth Karthikesan <knikanth at suse.de> wrote:
> > On Thursday 22 January 2009 11:09:45 Arve Hjønnevåg wrote:
> >> On Wed, Jan 21, 2009 at 9:13 PM, Nikanth Karthikesan <knikanth at suse.de> wrote:
> >> > To use oom_adj effectively one should continuously monitor oom_score
> >> > of all the processes, which is a complex moving target and keep on
> >> > adjusting the oom_adj of many tasks which still cannot guarantee the
> >> > order. This controller is deterministic and hence easier to use.
> >>
> >> Why not add an option to make oom_adj ensure strict ordering instead?
> >
> > This could be done in 2 ways.
> > 1. Make oom_adj itself strict (based on some other parameter?).
> > - Adds confusion about whether the current oom_adj is a strict value or
> > the usual suggestion.
> > - It would disable the oom_adj suggestion that could be used until now.
> > - It is a public interface, and changing it might break someone's
> > script.
> >
> > 2. Add an additional parameter, say /proc/<pid>/oom_order
> > - Not easy to use.
> > - Say I had assigned the oom.victim to a task and it had forked a lot.
> > Now to change the value for all the tasks it is easier with cgroups.
> > - Some optimization that Kame specified earlier would be harder to
> > achieve.
>
> Both options would work for us, but option 1 requires no change to our
> user space code.

Some scripts might assume that oom_adj always lies between -17 and +15, so no
more than 32+1 levels of ordering are possible. Yes, that should be enough in
most cases. But if you want to leave gaps so that tasks can be added in
between later, you have to take extra care or do more work. Someone might
also still assume the old behaviour and, looking at oom_score and oom_adj,
expect a different result. And if someone wants the old behaviour, we have to
provide an additional variable to enable/disable this.
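
To make the limit on levels concrete, here is a rough Python sketch (not part
of the patch). It assumes a hypothetical strict mode in which the task with
the highest oom_adj is always killed first; the function name
assign_strict_oom_adj and the gap parameter are made up for illustration.
Only the value range itself (-17 disables killing, -16 to +15 are the
adjustable values) comes from the existing interface.

```python
# Existing oom_adj value range.
OOM_DISABLE = -17      # task is never selected by the OOM killer
OOM_ADJUST_MIN = -16
OOM_ADJUST_MAX = 15


def assign_strict_oom_adj(tasks_in_kill_order, gap=1):
    """Map tasks (first entry = killed first) to distinct oom_adj values.

    Hypothetical strict semantics: a higher oom_adj is always killed
    before a lower one. A gap > 1 leaves unused values between
    neighbours so that tasks can be slotted in later.
    """
    if gap < 1:
        raise ValueError("gap must be at least 1")
    # Walk downwards from +15 to -16 in steps of `gap`.
    slots = range(OOM_ADJUST_MAX, OOM_ADJUST_MIN - 1, -gap)
    if len(tasks_in_kill_order) > len(slots):
        raise ValueError(
            "only %d distinct levels available with gap=%d"
            % (len(slots), gap)
        )
    return dict(zip(tasks_in_kill_order, slots))


if __name__ == "__main__":
    order = assign_strict_oom_adj(["browser", "editor", "sshd"], gap=4)
    for task, adj in order.items():
        print(task, adj)
```

With gap=1 this gives 32 killable levels (plus -17 for "never kill"); leaving
room between tasks shrinks that quickly, e.g. gap=4 leaves only 8 levels.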

> I agree that some operations are easier with a
> cgroups approach, but since we don't perform these operations it would
> be nice to not require cgroups to control the oom killer.

We would perform these operations if they were available and easier to use,
so it would be nice to use cgroups.

Thanks
Nikanth

