[RFC][PATCH] x86_64 support of checkpoint/restart (Re: Checkpoint / Restart)

Fri Mar 20 10:21:39 PDT 2009

On Tue, Mar 17, 2009 at 11:56 PM, Oren Laadan <orenl at cs.columbia.edu> wrote:
>
> I was very confused by the original post; there is no need for a special
> signal (handling) for checkpoint.
>
> To checkpoint, we first freeze (or stop) the processes, meaning that
> they are kept with an empty kernel stack before returning to user-mode.
>
> We then rely on the fact that a process saves everything that it needs
> before entering a syscall - so whatever is on the stack when it enters
> the kernel must be preserved, the rest can be overwritten. Otherwise,
> processes wouldn't survive context switches while in syscalls ...

Well, on x86_64 not everything the process needs is saved on the
stack before entering the system call; for example, the callee-saved
registers (rbx, rbp, r12, r13, r14, r15) are not. If these registers are
used anywhere in the kernel, they are saved on and restored from the
stack. On context switch, these registers are explicitly clobbered, so
that they are saved on the kernel stack of the outgoing process.
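
To make the point above concrete, here is a minimal user-space sketch
(not kernel code) of the difference between a partial register frame and
a full one. All names in it (struct cpu, struct frame, save_partial,
save_rest, checkpoint) are made up for illustration and only model the
idea; the real frames are built in entry_64.S.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the user registers live at syscall entry. */
struct cpu {
    unsigned long rbx, rbp, r12, r13, r14, r15; /* callee-saved */
    unsigned long rax, rdi, rsi, rdx;           /* some caller-clobbered ones */
};

/* Hypothetical model of the register frame on the kernel stack. */
struct frame {
    struct cpu regs;
    int has_callee_saved; /* nonzero once save_rest() has run */
};

/* Fast-path entry: only the caller-clobbered registers reach the frame. */
static void save_partial(struct frame *f, const struct cpu *c)
{
    assert(f != NULL && c != NULL);
    memset(f, 0, sizeof(*f));
    f->regs.rax = c->rax;
    f->regs.rdi = c->rdi;
    f->regs.rsi = c->rsi;
    f->regs.rdx = c->rdx;
}

/* Stub path: additionally store the six callee-saved registers. */
static void save_rest(struct frame *f, const struct cpu *c)
{
    assert(f != NULL && c != NULL);
    f->regs.rbx = c->rbx;
    f->regs.rbp = c->rbp;
    f->regs.r12 = c->r12;
    f->regs.r13 = c->r13;
    f->regs.r14 = c->r14;
    f->regs.r15 = c->r15;
    f->has_callee_saved = 1;
}

/* Copy the frame into a checkpoint image; -1 if the frame is incomplete. */
static int checkpoint(const struct frame *f, struct cpu *image)
{
    assert(f != NULL && image != NULL);
    if (!f->has_callee_saved)
        return -1;
    *image = f->regs;
    return 0;
}
```

In this model, calling checkpoint() after save_partial() alone fails,
while calling it after save_rest() as well succeeds and yields an image
containing all the registers.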

Anyway, with the stubs that I introduced in my patch, these registers
are saved before entering the system call, so the problem is solved. I
am now working on checkpointing/restoring 32-bit binaries on a 64-bit
kernel (i.e., compatibility mode). It works with internal
checkpointing, but results in a segfault in user mode after restore
for external checkpointing. I will post the patches as soon as I nail
it down.

>
> Oren.
>
> Nauman Rafique wrote:
>> Actually, looking at the code in entry_64.S again closely, external
>> checkpointing should work with my patch too. The callee-saved registers
>> -- rbx, rbp, r12, r13, r14, r15 -- are saved on the kernel stack
>> before calling the signal handling code (i.e. right before switching
>> from kernel to user mode). This signal handling code is called
>> whenever we try to checkpoint a process with SIGSTOP or the cgroup
>> freezer. Thus these registers will be on the kernel stack of the
>> checkpointed process, and we don't need any user-level signal handling
>> for external checkpointing to work on x86_64. Sorry for causing
>> confusion.
>>
>> On Tue, Feb 10, 2009 at 2:27 PM, Nauman Rafique <nauman at google.com> wrote:
>>> On Mon, Feb 9, 2009 at 10:02 AM, Dave Hansen <dave at linux.vnet.ibm.com> wrote:
>>>> On Fri, 2009-02-06 at 16:17 -0800, Nauman Rafique wrote:
>>>>> The patch sent by Masahiko assumes that all the user-space registers
>>>>> are saved on the kernel stack on a system call. This is not true for
>>>>> the majority of the system calls. The callee-saved registers (as
>>>>> defined by the x86_64 ABI) - rbx, rbp, r12, r13, r14, r15 - are saved
>>>>> only in some special cases. That means that these registers would not
>>>>> be available to the checkpoint code. Moreover, the restore code would
>>>>> have no space on the stack to restore those registers.
>>>> According to this:
>>>>
>>>> http://msdn.microsoft.com/en-us/library/6t169e9c(VS.80).aspx
>>>>
>>>> Those registers all get clobbered on all function calls.  I assume that
>>>> userspace also considers them to get clobbered on system calls as
>>>> well.
>>>>
>>>> What are those special cases you are talking about?  Certain special
>>>> cases for entering the kernel where we do save those registers?
>>> These are the system calls that use the same stub that I have used to
>>> save the full stack (and thus all the registers).
>>>        sys_clone
>>>        sys_fork
>>>        sys_vfork
>>>        sys_sigaltstack
>>>        sys_iopl
>>>
>>>> Signal handling and ptrace single stepping are two places I would
>>>> imagine we have to enter the kernel and preserve those registers.  Is
>>>> that why you were suggesting overloading signal delivery?
>>>>
>>>> Thanks for pointing out the problem, though.  This one will be
>>>> interesting. :)
>>>>
>>>> -- Dave
>>>>
>>>>
>>
>