[PATCH 1/3] powerpc: bare minimum checkpoint/restart implementation

Cedric Le Goater legoater at free.fr
Mon Mar 16 23:55:37 PDT 2009

> Again, how would 'cr' obtain exit status for these tasks, and how would
> it distinguish failure from normal operation?

Here's our solution to this issue.

mcr keeps an exitcode attribute for the mcr-restart process in its
kernel container object. The mcr-restart process is detached from the
fork tree of the restarted application.

When the restart is finished, an mcr-wait command can be called to reap
this exitcode. This makes it possible to distinguish an exit of the
application process from an exit of the mcr-restart process.

This is a must-have for batch managers in an HPC environment. 
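
To illustrate the separation described above, here is a rough user-space
sketch in Python. It is not mcr code: the Container class, restart() and
wait_restart() are hypothetical names made up for this example, and the
"kernel container object" is only modelled as a plain Python object. It
just shows the restart helper's exit code being stored apart from the
application's exit code, so a caller can tell the two apart.

```python
import subprocess
import sys


class Container:
    """Hypothetical stand-in for the kernel container object."""

    def __init__(self):
        self.restart_exitcode = None  # exit code of the restart helper only
        self.app_exitcode = None      # exit code of the restarted application


def run(code):
    """Run a child Python process that exits with the given code."""
    cmd = [sys.executable, "-c", f"import sys; sys.exit({code})"]
    return subprocess.run(cmd).returncode


def restart(container, restart_code, app_code):
    """Simulate a restart helper followed by the application itself.

    The two are separate processes, so their exit codes are recorded
    in separate attributes of the container.
    """
    container.restart_exitcode = run(restart_code)
    if container.restart_exitcode == 0:
        # The application only runs if the restart itself succeeded.
        container.app_exitcode = run(app_code)


def wait_restart(container):
    """Rough analogue of mcr-wait: return the restart helper's exit code."""
    return container.restart_exitcode


if __name__ == "__main__":
    ok = Container()
    restart(ok, restart_code=0, app_code=3)
    print("restart:", wait_restart(ok), "application:", ok.app_exitcode)

    failed = Container()
    restart(failed, restart_code=1, app_code=0)
    print("restart:", wait_restart(failed), "application:", failed.app_exitcode)
```

In the first case the restart succeeds and the application later exits
with its own status; in the second the restart itself fails and the
application never runs. A batch manager looking only at the restart
exit code can tell these cases apart.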


