[Ksummit-2010-discuss] checkpoint-restart: naked patch
orenl at cs.columbia.edu
Sun Nov 7 13:59:44 PST 2010
[cc'ing linux containers mailing list]
On 11/07/2010 01:49 PM, Gene Cooperman wrote:
> Matt had asked how we would handle inotify(), but I was getting swamped
> by all the questions. There is a virtualization approach to inotify in which
> one puts wrappers around inotify_add_watch(), inotify_rm_watch() and
> friends in the same way as we wrap open() and could wrap close().
> One would then need to wrap read() (which we don't like to do, just
This sounds like reimplementing in userspace the very same logic
that the kernel already implements :)
> in case it could add significant overhead). But if we consider kernel
> and userland virtualization together, then something similar to TIOCSTI
> for ioctl would allow us to avoid wrapping read().
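A minimal sketch of the bookkeeping such wrappers would need, to make the
quoted proposal concrete. Everything here is hypothetical: WatchTable, its
method names, and the injected add_watch/rm_watch callables (stand-ins for
the real inotify_add_watch()/inotify_rm_watch()) are illustrative
assumptions, not DMTCP's code. It covers only the descriptor table and says
nothing about events still queued in the kernel at checkpoint time.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Watch:
    path: str
    mask: int
    real_wd: int  # descriptor handed out by the kernel in the current run


class WatchTable:
    """Hypothetical userspace record of inotify watches (illustration only)."""

    def __init__(self, add_watch: Callable[[str, int], int],
                 rm_watch: Callable[[int], None]) -> None:
        # add_watch/rm_watch are assumed stand-ins for the real libc calls.
        self._add = add_watch
        self._rm = rm_watch
        self._watches: Dict[int, Watch] = {}
        self._next_virtual = 1

    def add(self, path: str, mask: int) -> int:
        """Wrapper for inotify_add_watch(): record the watch, return a virtual wd.

        Simplification: ignores that the kernel may return an existing
        descriptor when the same object is watched twice.
        """
        real_wd = self._add(path, mask)
        virtual_wd = self._next_virtual
        self._next_virtual += 1
        self._watches[virtual_wd] = Watch(path, mask, real_wd)
        return virtual_wd

    def remove(self, virtual_wd: int) -> None:
        """Wrapper for inotify_rm_watch(): forget the watch and drop the real one."""
        watch = self._watches.pop(virtual_wd)
        self._rm(watch.real_wd)

    def translate_event(self, real_wd: int) -> int:
        """What a wrapped read() would have to do: map a real wd to the virtual one."""
        for virtual_wd, watch in self._watches.items():
            if watch.real_wd == real_wd:
                return virtual_wd
        raise KeyError(real_wd)

    def restore(self, add_watch: Callable[[str, int], int],
                rm_watch: Callable[[int], None]) -> None:
        """On restart: re-create every recorded watch against a fresh inotify fd."""
        self._add, self._rm = add_watch, rm_watch
        for watch in self._watches.values():
            watch.real_wd = add_watch(watch.path, watch.mask)
```

The application only ever sees the virtual descriptors, so they stay stable
across a restart even though the kernel hands out different real ones. The
translate_event() step is the part that forces wrapping read().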
We could work to add ABIs and APIs for each and every possible piece
of state that affects userspace. And for each we'll argue forever
about the design, and some time later regret that it wasn't designed well.
Even if that happens (which is very unlikely and unnecessary),
it will generate all the very same code in the kernel that Tejun
has been complaining about, and _more_. And we will still suffer
from issues such as lack of atomicity and being unable to do many
simple and advanced optimizations.
Or we could use linux-cr for that: do the c/r in the kernel,
keep the know-how in the kernel, expose (and commit to) a
per-kernel-version ABI (rather than vowing to keep countless new individual
ABIs forever after getting them wrong...), be able to do all
sorts of useful optimizations and provide atomicity and guarantees
(see under "leak detection" in the OLS linux-cr paper). Also,
once the c/r infrastructure is in the kernel, it will be easy
(and encouraged) to support newly introduced features.
Finally, we could then use dmtcp as well as other tools on top
of the kernel c/r - and I'm looking forward to doing that!
>> Hmm... can you really c/r from userspace a process that was, at
>> checkpoint time, in a ptrace-stopped state at an arbitrary kernel
>> ptrace-hook ? I strongly suspect the answer is "no", definitely
>> not unless you also virtualize and replicate the entire in-kernel
>> ptrace functionality in userspace,
> Let's try it and see. If you write a program, we'll try it out in
> DMTCP (unstable branch) and see. So far, checkpointing gdb sessions
> has worked well for us. If there is something we don't cover, it will
> be helpful to both of us to find it, and analyze that case.
Try "strace bash" :)
I suspect it won't work - and for the reasons I described.
>> (Now looking forward to discuss more details with dmtcp team on
>> Tuesday and on :)
> Also a very good point above, and I agree. The offline discussion should
> be a better forum for putting this all into perspective.
> Thanks again for your thoughtful response,
Same here. Talk to you soon...