[Ksummit-2010-discuss] checkpoint-restart: naked patch

Tejun Heo tj at kernel.org
Fri Nov 19 07:33:25 PST 2010


Hello,

On 11/19/2010 03:36 PM, Kirill Korotaev wrote:
> Can you imagine how many userland APIs are needed to make userspace C/R?
> 
> Do you really want APIs in user-space which allow to:
> - send signals with siginfo attached (kill() doesn't work...)

Doesn't rt_sigqueueinfo() already do this?

> - read inotify configuration

This would be nice even apart from CR.

> - insert SKB's into socket buffers

Can't we drain kernel buffers?  ie. Stop further writing and wait the
send-q to drop to zero.

> - setup all TCP/IP parameters for sockets

I _think_ most can be restored by talking to netfilter module.
Setting outgoing sequence number might be beneficial tho.

> - wait for AIO pending in other processes

I haven't looked at aio implementation for a while now but can't we
drain these upon checkpointing and just carry the completion status?
Also, if aio is what you're concerned about, I would say the problem
is mostly solved.

> - setting different statistics counters (like netdev stats etc.)
> and so on...

Why would this matter?

> For every small piece of functionality you will need to export ABI
> and maintain it forever.  It's thousands of APIs! And why the hell
> they are needed in user space at all?

I think it's actually quite the contrary.  Most things are already
visible to userland.  They _have_ to be and that's the reason why
userland implementation can already get most things working without
any change to the kernel with some amount of hackery.  To me in-kernel
CR seems to approach the problem from the exactly wrong direction -
rather than dealing with specific exceptions, it create a completely
new framework which is very foreign and not useful outside of CR.

Also, think about it.  Which one is better?  A kernel which can fully
show its ABI visible states to userland or one which dumps its
internal data structurs in binary blobs.  To me, the latter seems
multiple orders of magnitude uglier.

> BTW, HPC case you are talking about is probably the simplest
> one.

Yet, it is one of the the most important / relevant use cases.

> Last time I looked into it, IBM Meiosis c/r didn't even bother with
> tty's migration.  In OpenVZ we really do need much more then that
> like autofs/NFS support, preserve statistics, TTYs, etc. etc. etc.

Would it be impossible to preserve autofs/NFS and TTYs from userland?
Then, why so?  For statistics, I'm a bit lost.  Why does it matter and
even if it does would it justify putting the whole CR inside kernel?

Thank you.

-- 
tejun


More information about the Containers mailing list