[RFC][PATCH 1/4] checkpoint-restart: general infrastructure

Serge E. Hallyn serue at us.ibm.com
Mon Aug 11 08:07:03 PDT 2008


Quoting Dave Hansen (dave at linux.vnet.ibm.com):
> On Sat, 2008-08-09 at 00:13 +0200, Arnd Bergmann wrote:
> > > I have to wonder if this is just a symptom of us trying to do this the
> > > wrong way.  We're trying to talk the kernel into writing internal gunk
> > > into a FD.  You're right, it is like a splice where one end of the pipe
> > > is in the kernel.
> > > 
> > > Any thoughts on a better way to do this?  
> > 
> > Maybe you can invert the logic and let the new syscalls create a file
> > descriptor, and then have user space read or splice the checkpoint
> > data from it, and restore it by writing to the file descriptor.
> > It's probably easy to do using anon_inode_getfd() and would solve this
> > problem, but at the same time make checkpointing the current thread
> > hard if not impossible.
> 
> Yeah, it does seem kinda backwards.  But, instead of even having to
> worry about the anon_inode stuff, why don't we just put it in a fs like
> everything else?  checkpointfs!

One reason not to is that I suspect a filesystem stops us from sending
the data straight into a pipe, to compress it and/or send it over the
network, without hitting local disk.  Though if checkpointfs were
RAM-based, maybe not?

As Oren has pointed out before, passing in an fd means we can pass a
socket into the syscall.
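
To make the point concrete, here is a rough user-space sketch, not the
proposed syscall.  write_checkpoint() is a hypothetical stand-in for
the kernel side: all it needs is a writable fd, so the same code feeds
a pipe (with a compressing reader on the other end) or a socket, and
nothing touches local disk.  All of the names below are made up for
illustration.

```python
import os
import socket
import threading
import zlib


def write_checkpoint(fd: int, image: bytes) -> None:
    """Hypothetical stand-in for the checkpoint writer: any writable fd will do."""
    view = memoryview(image)
    while view:
        # os.write() may write fewer bytes than asked; keep going until done.
        written = os.write(fd, view)
        view = view[written:]


def read_all(fd: int) -> bytes:
    """Read from fd until EOF."""
    chunks = []
    while True:
        chunk = os.read(fd, 65536)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def checkpoint_via_pipe(image: bytes) -> bytes:
    """Stream the image through a pipe to a reader that compresses it."""
    read_fd, write_fd = os.pipe()
    result = {}

    def consumer() -> None:
        result["data"] = zlib.compress(read_all(read_fd))
        os.close(read_fd)

    reader = threading.Thread(target=consumer)
    reader.start()
    write_checkpoint(write_fd, image)
    os.close(write_fd)  # closing the write end gives the reader EOF
    reader.join()
    return result["data"]


def checkpoint_via_socket(image: bytes) -> bytes:
    """Same writer, but the fd handed in is one end of a socket pair."""
    local, remote = socket.socketpair()
    result = {}

    def consumer() -> None:
        result["data"] = read_all(remote.fileno())
        remote.close()

    reader = threading.Thread(target=consumer)
    reader.start()
    write_checkpoint(local.fileno(), image)
    local.close()  # closing our end gives the peer EOF
    reader.join()
    return result["data"]
```

The writer never learns what kind of fd it was given, which is the
property that passing the fd into the call buys us.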

Using anon_inodes would also prevent that, but if either approach makes
for a cleaner overall solution then of course I'm not against
considering it.
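
For contrast, a rough sketch of the inverted model in user space.
open_checkpoint() is a hypothetical stand-in for a call that hands back
a readable fd (here just a pipe fed by a thread, not anon_inode_getfd()
itself), and relay_to_socket() is the extra copy loop user space would
then need in order to get the data onto a socket.  Both names are
assumptions for illustration only.

```python
import os
import socket
import threading


def open_checkpoint(image: bytes) -> int:
    """Hypothetical: return a readable fd that yields the checkpoint image."""
    read_fd, write_fd = os.pipe()

    def producer() -> None:
        view = memoryview(image)
        while view:
            view = view[os.write(write_fd, view):]
        os.close(write_fd)  # EOF for whoever reads the other end

    threading.Thread(target=producer, daemon=True).start()
    return read_fd


def relay_to_socket(src_fd: int, sock: socket.socket) -> int:
    """Copy everything from src_fd to sock; return the number of bytes copied."""
    total = 0
    while True:
        chunk = os.read(src_fd, 65536)
        if not chunk:
            return total
        sock.sendall(chunk)
        total += len(chunk)
```

The data still never has to hit disk this way, but user space sits in
the middle of the transfer instead of handing the socket to the call.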

> I'm also really not convinced that putting the entire checkpoint in one
> glob is really the solution, either.  I mean, is system call overhead
> really a problem here?
> 
> -- Dave
