[Ksummit-2010-discuss] checkpoint-restart: naked patch

Sukadev Bhattiprolu sukadev at linux.vnet.ibm.com
Mon Nov 22 10:02:41 PST 2010


Gene Cooperman [gene at ccs.neu.edu] wrote:
| > RELIABILITY     checkpoint w/ single syscall;   non-atomic, cannot find leaks
| >                 atomic operation. guaranteed    to determine restartability
| >                 restartability for containers
| 
| My understanding is that the guarantees apply for Linux containers, but not
| for a tree of processes.  Does this imply that linux-cr would have some
| of the same reliability issues as DMTCP for a tree of processes?  (I mean
| the question sincerely, and am not intending to be rude.)  In any case,
| won't DMTCP and Linux C/R have to handle orthogonal reliability issues
| such as external database, time virtualization, and other examples
| from our previous post?

Yes. If the user attempts to checkpoint a partial container (what we refer
to as a process subtree) or fails to snapshot/restore the filesystem, there
could be leaks that we cannot detect.

But one guarantee we are trying to provide is that if the user checkpoints
a _complete_ container, then we will detect a leak if one exists.

Is there a way to establish a set of constraints (e.g., run the application
in a container, snapshot/restore the filesystem) and then provide leak
detection with a pure userspace implementation?

Sukadev

