[cgl_discussion] POC 6/5 Meeting Minutes

Mika Kukkonen mika at osdl.org
Thu Jun 5 14:26:02 PDT 2003

On Thu, 2003-06-05 at 10:02, Steven Dake wrote:
> 2.7 wishlist ARs

Here is my first iteration of my AR's. Let me know if something is

> AR Mika to provide kexec text.

Ref: http://www.xmission.com/~ebiederm/files/kexec/
  From Eric Biederman's README file:
   "kexec is a set of systems call that allows you to load another 
    kernel from the currently executing Linux kernel."
  In a kernel fault situation, having the replacement kernel already
  loaded in memory and booting it with kexec achieves a significant
  savings in node downtime. For more information, see Andy's OLS paper:
  kexec was in -mm for a while, but was dropped in 2.5.70. While Andy
  will keep pushing it, it is listed as lowest priority in Andrew
  Morton's "must-fix" list, so it is likely that it will not make it
  into 2.6. Also, Eric Biederman seems to be engaged with other projects
  and his current code does not apply cleanly on 2.5.70, while the
  "branch" Andy has does.

> AR Mika to bring up network dump to specs group.

Ref: http://lkcd.sourceforge.net/
  For many reasons it is very important to know why a CGL node failed
  (as a kind of "black box" feature). In a cluster environment it is
  preferable to have that "black box" (i.e. the dump) written to a
  centralized place, which means dumping over the LAN. For more
  details see:
  The LKCD project seems to be in hibernation; at least the latest
  updates on SourceForge are from October last year, when there was a
  lot of discussion about LKCD between Linus and the LKCD developers.
  See for example this thread:
  Red Hat has netdump in their distribution, and the source is
  available, but there is no Open Source project behind it.

> AR Mika to provide text for Application restart text.
> AR Mika to clarify difference between rapid announcement of process 
> death and application restart.

Well, it seems that "application restart" has more or less disappeared
from our v2.0 spec. From the kernel point of view, what is relevant
seems to be the prochadd functionality (the "real" requirement behind
this "rapid announcement of process death"):

in-kernel process monitoring:
Ref: http://nscp.upenn.edu/aix4.3html/libs/ktechrf1/prochadd.htm

  There is a need for a supervising process to notice the death (etc.)
  of a process very quickly (in the range of tens of milliseconds).
  Traditionally this would be done by having the supervisor spawn the
  application processes, but this requires modification of application
  code, which in the majority of cases is not an option.
  Another way would be to "reparent" application processes to this
  supervisor, see this thread on LKML:
  A third way is to monitor procfs, but this seems to have several
  performance issues. A fourth is to use ptrace ("man 2 ptrace"),
  but that probably has some "issues" (not tested).
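
  For illustration, here is a minimal sketch of the traditional
  approach described above, where the supervisor spawns the application
  itself and blocks in waitpid() until it dies. The helper names
  spawn_app() and wait_for_death() are made up for this example, and
  the child simply exits instead of exec()ing a real application.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: spawn a child that exits immediately with
 * exit_code. A real supervisor would exec() the application here. */
pid_t spawn_app(int exit_code)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(exit_code);   /* child: stand-in for the application */
    return pid;             /* parent: child pid, or -1 if fork() failed */
}

/* Hypothetical helper: block until the child `pid` dies.
 * Returns its exit code, 128 + signal number if it was killed by a
 * signal, or -1 on error (for example if `pid` is not our child). */
int wait_for_death(pid_t pid)
{
    int status;
    pid_t r;

    /* Retry if the wait itself is interrupted by a signal. */
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);

    if (r == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

  Note that waitpid() only works on the caller's own children, which is
  exactly why this approach forces the supervisor to be the one that
  starts the application.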

> AR Mika to bring up application preloading kernel changes to specs
> AR Mika to bring up page flushing to specs subgroup.

These two are also closely related. The preloading is basically a glibc
patch (which while important is not a kernel issue) and a way to lock
(or "pin") _all_ memory pages of the loaded application into memory
(what's the point of loading it all if the pages get swapped out?).
Page flushing complements that by adding a way for a (possibly
non-root, although that may not be realistic) user or application to
flush some of those locked pages out.

User space memory page handling:
Ref: <none>

  In a CGL system there is a strong need to avoid run-time latencies
  related to loading code pages from disk (or over the network) by
  loading them all at node startup. On the other hand, there needs to
  be a way for a supervising process to force such an application to
  unload some of those locked pages (for example to allow loading a
  new, more important application).
  While there are some existing mechanisms in the kernel, nothing
  currently exists that fully implements the requirement. The reason
  we feel this is 2.7 material is that it should be reasonably simple
  to implement and has a potentially large user base.

