[RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work

Daniel Lezcano dlezcano at fr.ibm.com
Fri Oct 10 01:37:33 PDT 2008


Greg Kurz wrote:
> On Thu, 2008-10-09 at 12:04 -0700, Dave Hansen wrote:
>> Suggested by Ingo.
>>
>> Checkpoint/restart is going to be a long effort to get things working.
>> We're going to have a lot of things that we know just don't work for
>> a long time.  That doesn't mean that it will be useless, it just means
>> that there's some complicated features that we are going to have to
>> work incrementally to fix.
>>
>> This patch introduces a new mechanism to help the checkpoint/restart
>> developers.  A new function pair: task/process_deny_checkpoint() is
>> created.  When called, these tell the kernel that we *know* that the
>> process has performed some activity that will keep it from being
>> properly checkpointed.
>>
>> The 'flag' is an atomic_t for now so that we can have some level
>> of atomicity and make sure to only warn once.
>>
>> For now, this is a one-way trip.  Once a process is no longer
>> 'may_checkpoint' capable, neither it nor its children ever will be.
>> This can, of course, be fixed up in the future.  We might want to
>> reset the flag when a new pid namespace is created, for instance.
>>
> 
> Then this patch should be described as:
> 
> Track in-kernel when we expect checkpoint/restart to fail.
> 
> By the way, why don't you introduce the reverse operation ?

I think implementing the reverse operation would be a nightmare. IMHO it
is safe to say we deny checkpointing for the whole process life-cycle,
even if the created resource was destroyed before we initiate the
checkpoint.

For example, you create a socket and the process becomes
uncheckpointable. When you then close the socket (via sys_close), you
have to track that this close applies to the very socket which made the
process uncheckpointable in order to make the operation reversible.
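
To make the bookkeeping concrete, here is a minimal userspace sketch of
what a reversible scheme would need. It is only a model under stated
assumptions: the names task_model, resource_model, resource_open(),
resource_close() and task_may_checkpoint() are hypothetical and do not
come from the patch, which uses a single one-way flag instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names come from the patch. */
struct task_model {
    int deny_count;              /* live resources that block checkpointing */
};

struct resource_model {
    struct task_model *owner;
    bool denies_checkpoint;      /* did creating this resource pin the owner? */
};

/* Creating a resource must record whether it made the task uncheckpointable. */
static void resource_open(struct resource_model *r, struct task_model *t,
                          bool uncheckpointable)
{
    r->owner = t;
    r->denies_checkpoint = uncheckpointable;
    if (uncheckpointable)
        t->deny_count++;
}

/* Closing must be matched to the resource that pinned the task. */
static void resource_close(struct resource_model *r)
{
    if (r->denies_checkpoint) {
        assert(r->owner->deny_count > 0);
        r->owner->deny_count--;
        r->denies_checkpoint = false;
    }
}

static bool task_may_checkpoint(const struct task_model *t)
{
    return t->deny_count == 0;
}
```

Every creation and destruction path for every uncheckpointable resource
would need a hook like this, which is why the one-way flag is so much
cheaper to maintain.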

Let's imagine you implement this reverse operation anyway. You have a
process which creates a TCP connection, writes data and closes the
socket (so the process is checkpointable again), but in the namespace
there is still the orphan socket, which is not checkpointable yet, and
you missed this case.
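
The orphan case can be shown with a small userspace model. This is a
sketch under assumptions, not kernel code: ns_model, proc_model and the
helper functions are hypothetical names, and "unsent_data" stands in for
whatever keeps the closed socket alive in the network stack.

```c
#include <stdbool.h>

/* Hypothetical userspace model; these names are illustrative only. */
struct ns_model {
    int orphan_sockets;      /* closed by their owner, still alive in the stack */
};

struct proc_model {
    struct ns_model *ns;
    int open_sockets;        /* sockets this process still holds */
};

static void proc_socket_open(struct proc_model *p)
{
    p->open_sockets++;
}

/* Close a socket; with unsent data it lingers in the namespace as an orphan. */
static void proc_socket_close(struct proc_model *p, bool unsent_data)
{
    p->open_sockets--;
    if (unsent_data)
        p->ns->orphan_sockets++;
}

/* Naive reversible check: looks at the process only. */
static bool proc_looks_checkpointable(const struct proc_model *p)
{
    return p->open_sockets == 0;
}

/* What would actually be needed: the namespace must be quiet as well. */
static bool proc_really_checkpointable(const struct proc_model *p)
{
    return p->open_sockets == 0 && p->ns->orphan_sockets == 0;
}
```

After an open followed by a close with unsent data, the per-process
check says yes while the namespace-wide check still says no, which is
exactly the case a per-process reverse operation would miss.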

