[Ce-android-mainline] [RFC] Userspace low memory killer daemon

Anton Vorontsov anton.vorontsov at linaro.org
Wed Apr 11 13:23:40 UTC 2012


[ My first email didn't make it to the list, it is still waiting for
  moderator approval; so I decided to subscribe and resend this. ]

Hi all,

We're requesting feedback on a userspace low memory killer daemon
(ulmkd). It behaves the same way as the kernel lowmemorykiller (LMK)
driver, except that the killing policy now lives in userland, while
the kernel only provides 'low memory notifications'.

ulmkd is a drop-in replacement for the lowmemorykiller driver. One can
either disable CONFIG_ANDROID_LOW_MEMORY_KILLER in the kernel config,
or set very high limits in
/sys/module/lowmemorykiller/parameters/minfree, which will effectively
disable kernel LMK; then start ulmkd.

ulmkd is available at this git repo:

	git://git.infradead.org/users/cbou/ulmkd.git


The userland low memory killer daemon consists of two parts:

1. Low memory notifications handling.

   Currently there is only one memory usage notification mechanism in
   the kernel: the cgroups memory controller, memcg. There is also
   a vmevent infrastructure under development that provides more
   lightweight* notifications.

   * lightweight in the sense of kernel code size, i.e. +60KB for
   cgroups[1] vs. +1KB for vmevent[2]; plus, cgroups has some runtime
   memory overhead, about 0.1% of RAM[3].

   Both low memory notification mechanisms are implemented as a
   simple poll() on a Linux eventfd.

   ulmkd was made modular, so notification methods (cgroups and vmevent)
   are easily interchangeable at build time.

2. Task list management.

   Two task list management mechanisms were implemented in ulmkd:

   - /proc based (i.e. the daemon reads PIDs and oom_adj values
     from the kernel /proc directory).

   - shared memory based, where it is expected that Android Activity
     Manager would keep the task list in the memory, and share it with
     the killer daemon (the demo_shm.c file provides a small example,
     it just proxies task list from /proc to a shared memory).

     (Actually, Activity Manager already manages its own task list,
     it just does not share it.)

   As with notifications, both of these methods (/proc or shm)
   are interchangeable in ulmkd, and are chosen at build time.

   Some words on efficiency, latency, etc.

   The shm based variant is more efficient, and the option
   to implement the Activity Manager part of it is always there.

   But even the /proc-based task list method alone would be good
   enough, because reaction time (latency) is not actually an
   issue in "Low Memory Killer" duties.

   For example, let's take the current LMK kernel driver. Suppose
   LMK was configured to start killing processes when free memory
   drops below a 64MB threshold. In testing, it was noticed that
   the kernel LMK driver does not actually start killing processes
   at that point.

   This is because kernel LMK uses the 'shrinker API': the shrinker
   callback only gets called when memory pressure is very high
   (i.e. the kernel asks slab users to free their unneeded caches,
   to avoid swapping). On the test setups the shrinker callback
   gets called when we're almost out of memory, i.e. far beyond
   the 64MB limit.

   This issue has seemingly always been there, and it suggests
   that reaction time to the specified thresholds isn't as
   critical as one would otherwise expect.

   What is important is the reaction time upon a true "out of
   memory" condition, when there is really no memory available.

   But in that case we still have the in-kernel Out Of Memory
   Killer (OOMK), which is as fast as it gets.

   And since the Activity Manager still manages per-process
   oom_adj values, in any real OOM event the OOM killer will
   kill the right task: the least important task per oom score
   and oom_adj.
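
To make the two parts above concrete, here is a minimal sketch of
the wait-then-scan flow: block in poll() on a notification eventfd,
then walk /proc for the task with the highest oom_adj. This is not
code from the ulmkd repo; the helper names (wait_for_low_memory,
read_oom_adj, pick_victim) are hypothetical, and registering the
eventfd with memcg or vmevent is deliberately left out.

```c
/* Sketch only: helper names are hypothetical, not taken from ulmkd. */
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <poll.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Block until the eventfd fires; return its counter, or 0 on error. */
static uint64_t wait_for_low_memory(int efd)
{
	struct pollfd pfd = { .fd = efd, .events = POLLIN };
	uint64_t count = 0;

	if (poll(&pfd, 1, -1) <= 0 || !(pfd.revents & POLLIN))
		return 0;
	/* Reading an eventfd returns (and resets) its 8-byte counter. */
	if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return 0;
	return count;
}

/* Read /proc/<pid>/oom_adj into *adj; return 0 on success. */
static int read_oom_adj(pid_t pid, int *adj)
{
	char path[64];
	FILE *f;
	int ok;

	assert(adj != NULL);
	snprintf(path, sizeof(path), "/proc/%d/oom_adj", (int)pid);
	f = fopen(path, "r");
	if (!f)
		return -1; /* task exited, or oom_adj is not readable */
	ok = (fscanf(f, "%d", adj) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Scan /proc and return the PID with the highest oom_adj, or -1 if
 * none was found. A real daemon would also skip itself and any task
 * it must never kill; that policy is omitted here.
 */
static pid_t pick_victim(int *victim_adj)
{
	DIR *proc = opendir("/proc");
	struct dirent *de;
	pid_t victim = -1;
	int best = -100; /* lower than any valid oom_adj value */

	if (!proc)
		return -1;
	while ((de = readdir(proc)) != NULL) {
		char *end;
		long pid;
		int adj;

		if (!isdigit((unsigned char)de->d_name[0]))
			continue; /* not a PID directory */
		pid = strtol(de->d_name, &end, 10);
		if (*end != '\0' || read_oom_adj((pid_t)pid, &adj) != 0)
			continue;
		if (adj > best) {
			best = adj;
			victim = (pid_t)pid;
		}
	}
	closedir(proc);
	if (victim_adj)
		*victim_adj = best;
	return victim;
}
```

A daemon loop would call wait_for_low_memory() on the registered
eventfd and, on each wake-up, pass the result of pick_victim() to
kill(). Note that fopen() and opendir() allocate memory, which is
exactly the weakness of the /proc backend compared to shm.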


Now, why was ulmkd implemented as a stand-alone daemon instead
of leaving things as is (i.e. kernel LMK) or integrating it into
Android Activity Manager? There are a few reasons:

- A userland daemon is how the vast majority of Linux kernel
  community members see the low memory killer implemented. The
  Linux community has no problem w/ giving some generic means to
  notify userland about low memory conditions (and we have
  cgroups and vmevent), but nobody wants to see any policy in
  the kernel.

  LMK is too ad-hoc a solution, while low memory notifications
  are a highly demanded generic feature (and as far as I know,
  some sort of userland notification is already used by Nokia
  in their N9 product).

  And finally, when implemented correctly, LMK would very much
  duplicate either vmevent or memcg.

- As for integrating ulmkd into Activity Manager... There are two
  reasons not to:

  a) We don't want to introduce JNI for cgroups and vmevent.
     This brings a lot of complexity and ties the Java code too
     closely to Linux internals.

  b) In a JVM we can't guarantee 'no new memory allocations';
     we have no control over what the JVM does with memory. We
     don't want the killer to be swapped out, so for ulmkd we'd
     call mlockall() on a small binary, but mlock'ing the whole
     JVM is not acceptable.

     Or, suppose we should kill a process, but then the JVM code
     that issues kill() needs memory itself. (Actually, a similar
     problem exists in the '/proc-based task list backend' in
     ulmkd, as opendir()/readdir() will need memory. But with the
     shm approach this is a non-issue.)
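
The mlockall() point in (b) can be sketched as follows. This is an
illustration under assumptions, not ulmkd's actual code: the
function names lock_daemon_memory and kill_victim are hypothetical,
and the PID guard is just one plausible safety check.

```c
/* Sketch only: function names are hypothetical, not taken from ulmkd. */
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Pin all current and future pages of this process in RAM so the
 * killer itself is never swapped out. This is cheap for a small
 * native binary; doing the same for a whole JVM would pin far more.
 */
static int lock_daemon_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
		/* Typically EPERM or ENOMEM when RLIMIT_MEMLOCK is too low. */
		fprintf(stderr, "mlockall: %s\n", strerror(errno));
		return -1;
	}
	return 0;
}

/*
 * Send SIGKILL to one victim. kill(2) is a plain system call and
 * needs no new userspace allocation, unlike a managed runtime.
 */
static int kill_victim(pid_t pid)
{
	if (pid <= 1) /* refuse init, process groups and "all processes" */
		return -1;
	return kill(pid, SIGKILL);
}
```

A daemon would call lock_daemon_memory() once at start-up, before
entering its notification loop, and kill_victim() for each task it
selects.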

p.s. And some numbers to compare binary sizes of LMK and ulmkd, if
anyone cares:

~/src/linux/linux-cbou$ file drivers/staging/android/lowmemorykiller.o
drivers/staging/android/lowmemorykiller.o: ELF 32-bit LSB relocatable, ARM, version 1, not stripped
~/src/linux/linux-cbou$ size drivers/staging/android/lowmemorykiller.o
   text    data     bss     dec     hex filename
   1240     104       8    1352     548 drivers/staging/android/lowmemorykiller.o

~/src/ulmkd$ file ulmkd
ulmkd: ELF 32-bit LSB executable, ARM, version 1 (SYSV), dynamically linked (uses shared libs), stripped
~/src/ulmkd$ size ulmkd
   text    data     bss     dec     hex filename
   7568     480      32    8080    1f90 ulmkd
~/src/ulmkd$ size ulmkd.o
   text    data     bss     dec     hex filename
   2256      96       4    2356     934 ulmkd.o

I.e. 8 KB (linked) vs. 1 KB


[1] http://lkml.org/lkml/2011/12/20/457
[2] ~/src/linux/linux-vmevent$ size mm/vmevent.o
   text    data     bss     dec     hex filename
   1307       0      16    1323     52b mm/vmevent.o
[3] http://lkml.org/lkml/2011/12/19/562

