[Ksummit-2010-discuss] checkpoint-restart: naked patch
sukadev at linux.vnet.ibm.com
Mon Nov 22 10:02:41 PST 2010
Gene Cooperman [gene at ccs.neu.edu] wrote:
| > RELIABILITY checkpoint w/ single syscall; non-atomic, cannot find leaks
| > atomic operation. guaranteed to determine restartability
| > restartability for containers
| My understanding is that the guarantees apply for Linux containers, but not
| for a tree of processes. Does this imply that linux-cr would have some
| of the same reliability issues as DMTCP for a tree of processes? (I mean
| the question sincerely, and am not intending to be rude.) In any case,
| won't DMTCP and Linux C/R have to handle orthogonal reliability issues
| such as external database, time virtualization, and other examples
| from our previous post?
Yes, if the user attempts to checkpoint a partial container (what we refer
to as a process subtree) or fails to snapshot/restore the filesystem, there
could be leaks that we cannot detect.
But one guarantee we are trying to provide is that if the user checkpoints
a _complete_ container, then we will detect a leak if one exists.
Is there a way to establish a set of constraints (e.g., run the application
in a container, snapshot/restore the filesystem) and then provide leak
detection with a pure userspace implementation?