checkpoint/restart ABI

Eric W. Biederman ebiederm at xmission.com
Thu Aug 28 16:40:21 PDT 2008


"Serge E. Hallyn" <serue at us.ibm.com> writes:

> Quoting Peter Chubb (peterc at gelato.unsw.edu.au):

>> Beefing up ptrace or fixing /proc to be a real debugging interface
>> would be a start ... when you can get at *all* the info you need,
>
> Except we don't really want to export all the info you need for a
> complete restartable checkpoint.  And especially not make it
> generally writable.

That and unless we get a lot of synergy from authors of debuggers
and debugging code it is a more general and slower interface for
no apparent gain.

> We have also started down that path using ptrace (see cryo, at
> git://git.sr71.net/~hallyn/cryodev.git).
>
> Right before the containers mini-summit, where the general agreement was
> that a complete in-kernel solution ought to be pursued, I had tried
> a restart using a binary format that read a checkpoint file and used
> cryo (userspace using ptrace) for the rest of the restart, only
> because there was no other reasonable way to set tsk->did_exec on
> restart.

Can we please describe this as the giant syscall approach.  Instead
of a complete in-kernel solution.  There are things like filesystems
that should be checkpointed separately, or not checkpointed at all.

However there is a large set of processes and process state that always
goes together and if you checkpoint a container you always want.

So building something that is roughly equivalent to a binfmt module
but that can save and restore multiple tasks with a single operation
looks like the right granularity.

>> Jeremy> Lightweight filesystem checkpointing, such as btrfs provides,
>> Jeremy> would seem like a powerful mechanism for handling a lot of the
>> Jeremy> filesystem state problems.  It would have been useful when we
>> Jeremy> did this...
>> 
>> And how!  saving bits of files was very timeconsuming.
>
> Yes, we're looking forward to using btrfs' snapshots :)

Yep.  And in the case of migration we don't even need to snapshot
a filesystem just mount it from on the target machine.  Except for
the unlinked files challenge.

Eric


More information about the Containers mailing list