[RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work

Fri Oct 10 09:34:49 PDT 2008

On Fri, 2008-10-10 at 11:17 -0400, Oren Laadan wrote:
> 
> Ingo Molnar wrote:
> > * Daniel Lezcano <dlezcano at fr.ibm.com> wrote:
> > 
> >>> By the way, why don't you introduce the reverse operation ?
> >> I think implementing the reverse operation will be a nightmare, IMHO 
> >> it is safe to say we deny checkpointing for the process life-cycle 
> >> even if the created resource was destroyed before we initiate the 
> >> checkpoint.
> > 
> > it's also a not too interesting case. The end goal is to just be able to 
> > checkpoint everything that matters - in the long run there simply won't 
> > be many places that are marked 'cannot checkpoint'.
> > 
> > So the ability to deny a checkpoint is a transitional feature - a 
> > flexible CR todo list in essence - but also needed for 
> > applications/users that want to rely on CR being a dependable facility.
> > 
> > It would be bad for most of the practical usecases of checkpointing to 
> > allow the checkpointing of an app, just to see it break on restore due 
> > to lost context.
> 
> Actually it need not wait for restore to fail - it can fail during the
> checkpoint, as soon as the unsupported feature is encountered.
> 

Of course, bad things must be spotted at checkpoint time! :)

> Adding the flag you suggest will help make it more vocal and
> obvious that a feature isn't supported, even without the user actually
> trying to take a checkpoint. I like that idea.
> 

This flag is weak... testing it gives absolutely no hint whether the
checkpoint may succeed or not. As it is designed now, all a user can
learn from it is that checkpoint is *forever* denied. I agree that it's only useful
as a "flexible CR todo list".

In the long run, if there are still things that can prevent checkpoint
from being consistent, they will have to be checked at checkpoint time.
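
To make the distinction concrete, here is a rough user-space sketch. It
is not code from the patch: struct task_model and every function name
below are made up for illustration, and it models only the two kinds of
test, not any real kernel state. It contrasts a sticky flag (cleared
once, never set again) with a test on the current state done at
checkpoint time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the patch. */
struct task_model {
	bool may_checkpoint;     /* sticky flag: once cleared, stays cleared */
	int  unsupported_in_use; /* unsupported resources held right now */
};

static void task_init(struct task_model *t)
{
	t->may_checkpoint = true;
	t->unsupported_in_use = 0;
}

/* The task starts using something checkpoint cannot handle. */
static void task_acquire_unsupported(struct task_model *t)
{
	t->may_checkpoint = false; /* one-way: no reverse operation */
	t->unsupported_in_use++;
}

/* The task drops that resource again. */
static void task_release_unsupported(struct task_model *t)
{
	assert(t->unsupported_in_use > 0);
	t->unsupported_in_use--;
	/* may_checkpoint deliberately stays false */
}

/* Flag test: "was anything unsupported ever used?" */
static bool flag_allows_checkpoint(const struct task_model *t)
{
	return t->may_checkpoint;
}

/* Checkpoint-time test: "is anything unsupported in use now?" */
static bool state_allows_checkpoint(const struct task_model *t)
{
	return t->unsupported_in_use == 0;
}
```

After one acquire followed by one release, flag_allows_checkpoint()
still says no while state_allows_checkpoint() says yes again. Neither
answer promises that a real checkpoint would succeed; the sketch only
shows why the sticky flag can report nothing more than "denied forever".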

Greg.