[BIG RFC] Filesystem-based checkpoint

Dave Hansen dave at linux.vnet.ibm.com
Thu Oct 30 13:40:25 PDT 2008


On Thu, 2008-10-30 at 16:15 -0400, Oren Laadan wrote:
> Dave Hansen wrote:
> > This is a blob.  It's simply a blob exported in a filesystem.  Note that
> > it exports the same format as the 'big blob' with the same types.  Stick
> > a couple of cr_hdr* objects on to what we have in the filesystem, and we
> > get the same blob that we have now.
> > 
> > How would a tarball of this filesystem be any less of a blob than the
> > output from sys_checkpoint() is now?
> 
> It isn't a blob per se - it exposes the structure via the file system;
> tomorrow someone will write a program that relies on that structure, and
> the next time you wanna change something you open a can of worms.

This is an ABI that I'm proposing.  But, so is the blob from the
syscall.  Are you saying that people can't write programs that depend on
the structure of the data returned from the syscall?  

> How likely is this to happen if you used, for instance, a single file in
> your file system approach ?

Certain.  Just as people will write programs to access only parts of
the files that sys_checkpoint() produces.

> > If we were doing a configfs-style restart, the restarter would simply
> > restore those two files.  The act of doing open(O_CREAT) is the same
> > trigger as what you have now when a cr_hdr of some type is encountered.
> 
> What you did not address in your response, is that the thing with shared
> resources is that they appear more than once. In your terminology, they
> would show up in multiple places in the tree. Then they would be saved
> multiple times ?

*References* will show up more than once, but the resource itself only
needs to be saved once.  Filesystems handle references today with
symlinks or hard links.  We could either do that or force userspace to
do the duplicate and sharing detection itself.

You could get nice and creative here.  For instance, look at the link
count on a file.  If it is 1, write out the record for that resource
into the checkpoint file.  If it is >1, then write out a reference and
unlink the file.  
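
To make the link-count idea concrete, here is a minimal userspace sketch.
It assumes shared resources appear in the checkpoint tree as hard links to
one inode.  The function name emit_resource(), the enum values, and the
"ref"/"record" text lines are hypothetical, invented for illustration; they
are not part of any proposed ABI.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical outcome codes, for illustration only. */
enum emit_result { EMIT_ERROR = -1, EMIT_RECORD = 0, EMIT_REFERENCE = 1 };

/*
 * Serialize one resource file found while walking the checkpoint tree.
 *
 * Link count > 1: other names still refer to this inode, so write only
 * a reference keyed by (device, inode) and unlink this name.  That
 * lowers the link count seen at the next name.
 *
 * Link count == 1: this is the last remaining name, so write the full
 * record.  The output line format here is made up.
 */
enum emit_result emit_resource(const char *path, FILE *out)
{
    struct stat st;

    assert(path != NULL && out != NULL);

    /* lstat() so we look at the name itself, not a symlink target. */
    if (lstat(path, &st) != 0)
        return EMIT_ERROR;

    if (st.st_nlink > 1) {
        fprintf(out, "ref %llu:%llu\n",
                (unsigned long long)st.st_dev,
                (unsigned long long)st.st_ino);
        if (unlink(path) != 0)
            return EMIT_ERROR;
        return EMIT_REFERENCE;
    }

    fprintf(out, "record %llu:%llu size=%lld\n",
            (unsigned long long)st.st_dev,
            (unsigned long long)st.st_ino,
            (long long)st.st_size);
    return EMIT_RECORD;
}
```

One consequence of this ordering: for a resource with N names, the first
N-1 visits emit references and only the last visit emits the record, so
whatever reads the stream back has to cope with a reference that arrives
before the record it points to.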

-- Dave


