[RFC v6][PATCH 0/9] Kernel based checkpoint/restart

Ingo Molnar mingo at elte.hu
Thu Oct 9 06:17:01 PDT 2008


* Dave Hansen <dave at linux.vnet.ibm.com> wrote:

> On Thu, 2008-10-09 at 14:46 +0200, Ingo Molnar wrote:
> > * Oren Laadan <orenl at cs.columbia.edu> wrote:
> > 
> > > These patches implement basic checkpoint-restart [CR]. This version 
> > > (v6) supports basic tasks with simple private memory, and open files 
> > > (regular files and directories only). Changes mainly cleanups. See 
> > > original announcements below.
> > 
> > i'm wondering about the following productization aspect: it would be 
> > very useful to applications and users if they knew whether it is safe to 
> > checkpoint a given app. I.e. whether that app has any state that cannot 
> > be stored/restored yet.
> 
> Absolutely!
> 
> My first inclination was to do this at checkpoint time: detect and 
> tell users why an app or container can't actually be checkpointed.  
> But, if I get you right, you're talking about something that happens 
> more during the runtime of the app than during the checkpoint.  This 
> sounds like a wonderful approach to me, and much better than what I 
> was thinking of.
> 
> What kind of mechanism do you have in mind?
> 
> int sys_remap_file_pages(...)
> {
> 	...
> 	oh_crap_we_dont_support_this_yet(current);
> }
> 
> Then the oh_crap..() function sets a task flag or something?

yeah, something like that. A key aspect of it is that it has to be very 
low-key on the source code level - we don't want to sprinkle the kernel 
with anything ugly. Perhaps something pretty explicit:

  current->flags |= PF_NOCR;

as we do the same thing today for certain facilities:

  current->flags |= PF_NOFREEZE;

you probably want to hide it behind:

  set_current_nocr();

and have a set_task_nocr() as well, in case there's some proxy state 
installed by another task.

Via such wrappers there's no overhead at all in the 
!CONFIG_CHECKPOINT_RESTART case.
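
A minimal userspace sketch of how such wrappers might look, not actual 
kernel code. The struct task_struct stand-in, the PF_NOCR bit value, the 
mock "current" pointer and the task_nocr() query helper are all 
assumptions made for illustration; only the names PF_NOCR, 
set_current_nocr() and set_task_nocr() come from the proposal above.

```c
/* Userspace stand-in for the kernel's task structure (assumption). */
struct task_struct {
	unsigned long flags;
};

/* Hypothetical bit value, chosen for this sketch only. */
#define PF_NOCR 0x1UL

/* Remove this define to see the helpers compile down to no-ops. */
#define CONFIG_CHECKPOINT_RESTART 1

/* Mock of the kernel's "current" task pointer (assumption). */
static struct task_struct init_task;
static struct task_struct *current = &init_task;

#ifdef CONFIG_CHECKPOINT_RESTART
/* Mark a task as holding state that cannot be checkpointed yet. */
static inline void set_task_nocr(struct task_struct *t)
{
	t->flags |= PF_NOCR;
}

/* Hypothetical query helper, e.g. for use at checkpoint time. */
static inline int task_nocr(const struct task_struct *t)
{
	return (t->flags & PF_NOCR) != 0;
}
#else
/* Empty stubs: the compiler can drop the calls entirely. */
static inline void set_task_nocr(struct task_struct *t) { (void)t; }
static inline int task_nocr(const struct task_struct *t) { (void)t; return 0; }
#endif

/* Convenience wrapper for the common "mark myself" case. */
static inline void set_current_nocr(void)
{
	set_task_nocr(current);
}
```

With the config option off, both setters are empty inline functions, so 
call sites stay a single line and should cost nothing.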

Plus you could drive the debug mechanism via it as well, by using a 
trivial extension of the facility:

  set_current_nocr("CR: sys_remap_file_pages not supported yet.");
  ...
  set_task_nocr(t, "CR: PI futexes not supported yet.");
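
A rough userspace sketch of that extension, under stated assumptions: 
the nocr_reason field, the task_nocr_reason() helper, the PF_NOCR bit 
value and the mock "current" pointer are hypothetical, and keeping only 
the first recorded reason is just one possible policy.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's task structure (assumption). */
struct task_struct {
	unsigned long flags;
	const char *nocr_reason;	/* hypothetical field */
};

/* Hypothetical bit value, chosen for this sketch only. */
#define PF_NOCR 0x1UL

/* Mock of the kernel's "current" task pointer (assumption). */
static struct task_struct init_task;
static struct task_struct *current = &init_task;

/* Set the flag and remember why; the first reason recorded wins. */
static inline void set_task_nocr(struct task_struct *t, const char *reason)
{
	if (!(t->flags & PF_NOCR))
		t->nocr_reason = reason;
	t->flags |= PF_NOCR;
}

#define set_current_nocr(reason) set_task_nocr(current, (reason))

/* Returns the recorded reason, or NULL if the task is still clean. */
static inline const char *task_nocr_reason(const struct task_struct *t)
{
	return (t->flags & PF_NOCR) ? t->nocr_reason : NULL;
}
```

A checkpoint attempt could then report task_nocr_reason() to the user 
instead of failing silently.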

	Ingo

