[RFC][PATCH 0/7 + tools] Checkpoint/restore mostly in the userspace

Matt Helsley matthltc at us.ibm.com
Tue Jul 26 15:59:11 PDT 2011


On Sat, Jul 23, 2011 at 05:53:46AM +0200, Tejun Heo wrote:
> Hello,
> 
> On Sat, Jul 23, 2011 at 2:25 AM, Matt Helsley <matthltc at us.ibm.com> wrote:
> > Then there's the matter of unlinked files. How do you plan to deal
> > with those without kernel code?
> 
> /proc/PID/fd already provides access to deleted files perfectly well
> as most avid p0rn watchers would know (you can run mplayer on flash's
> deleted temp files). ;)

Yup, that gives you access to the unlinked file's contents. This is an
example where things in /proc appear simple and complete yet are
insufficient. Here's what you'll need:

The " (deleted)" suffix on a /proc/PID/fd link target is, strictly
speaking, ambiguous -- it does not necessarily mean the file is unlinked,
since a live file's name can end in that string too. You also can't infer
that it is unlinked by stat()'ing that path, since a different file could
have been created in the same spot. For something unambiguous you'll
have to add that information to /proc somewhere. fdinfo doesn't seem
to be the right place since fds aren't unlinked -- files are.
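
To make the ambiguity concrete, here is a minimal Python sketch. It
assumes Linux with /proc mounted; the helper names (fd_link_target,
demo_ambiguous_suffix) and the file names are made up for illustration.
It opens one live file whose name really ends in " (deleted)" and one
file that is unlinked while open, then reads both /proc/self/fd links:

```python
import os
import tempfile


def fd_link_target(fd):
    """Return what /proc/self/fd/<fd> points at (Linux only)."""
    return os.readlink(f"/proc/self/fd/{fd}")


def demo_ambiguous_suffix():
    """Return the /proc link targets of a live file and an unlinked file."""
    with tempfile.TemporaryDirectory() as tmp:
        # A live file whose name happens to end in " (deleted)".
        live_path = os.path.join(tmp, "data (deleted)")
        # A file that really is unlinked while we hold it open.
        gone_path = os.path.join(tmp, "data")

        live_fd = os.open(live_path, os.O_CREAT | os.O_RDWR, 0o600)
        gone_fd = os.open(gone_path, os.O_CREAT | os.O_RDWR, 0o600)
        try:
            os.unlink(gone_path)
            return fd_link_target(live_fd), fd_link_target(gone_fd)
        finally:
            os.close(live_fd)
            os.close(gone_fd)


if __name__ == "__main__":
    live, gone = demo_ambiguous_suffix()
    print("live file:    ", live)
    print("unlinked file:", gone)
```

The two link targets come out as the same string, so the path text
alone cannot tell the two cases apart.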

Then you've got to detect when two fds refer to the same unlinked file
and share the copy upon restart. Or they could be different unlinked
files with the same path, in which case you should not share the copy. I
suppose you'll have to check the device and inode and then see if any
other task being checkpointed has it open... once for each of
potentially thousands of fds being checkpointed.
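
A rough Python sketch of the device-and-inode check, for illustration
only. The function names are hypothetical, and it only looks at fds of
the calling process via fstat(); a real checkpointer would presumably
have to stat() /proc/PID/fd/N for every task instead. Keying a dict on
(st_dev, st_ino) at least makes it one pass over the fds rather than a
pairwise comparison:

```python
import os
import tempfile
from collections import defaultdict


def group_fds_by_file(fds):
    """Map (st_dev, st_ino) -> list of fds that refer to that inode."""
    groups = defaultdict(list)
    for fd in fds:
        st = os.fstat(fd)
        groups[(st.st_dev, st.st_ino)].append(fd)
    return dict(groups)


def demo_grouping():
    """Two fds on one unlinked file, plus a different file at the same path."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scratch")
        a = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
        b = os.open(path, os.O_RDWR)  # same file as a
        os.unlink(path)
        # A brand-new file created at the same path, then unlinked too.
        c = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
        os.unlink(path)
        try:
            names = {a: "a", b: "b", c: "c"}
            groups = group_fds_by_file([a, b, c])
            # Report groups by label so the result does not depend on fd numbers.
            return sorted(sorted(names[fd] for fd in g) for g in groups.values())
        finally:
            for fd in (a, b, c):
                os.close(fd)


if __name__ == "__main__":
    print(demo_grouping())
```

Here a and b land in one group and should share a copy, while c, which
had the same path, is a different file and must not.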

Then there's the case where you've got one unlinked dentry for the
file but a hardlink elsewhere. The /proc/PID/fd path won't point to the
hardlinked location. So in order for those to be the same file upon
restart you need to find the file somehow during checkpoint and/or
restart.
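
The hardlink case can be reproduced with a short Python sketch (the
function and file names are invented for the example). After the name
the file was opened under is unlinked, fstat() still reports a link
count of 1, which says another name exists but not where it is; finding
it would mean searching the filesystem for the same device and inode:

```python
import os
import tempfile


def demo_hardlinked_unlink():
    """Unlink the opened name while a hardlink keeps the inode reachable."""
    with tempfile.TemporaryDirectory() as tmp:
        opened_as = os.path.join(tmp, "opened-name")
        other = os.path.join(tmp, "other-name")
        fd = os.open(opened_as, os.O_CREAT | os.O_RDWR, 0o600)
        try:
            os.link(opened_as, other)  # second name for the same inode
            os.unlink(opened_as)       # the dentry we opened is now gone
            st_fd = os.fstat(fd)
            st_other = os.stat(other)
            nlink_after_first = st_fd.st_nlink
            same_file = (st_fd.st_dev, st_fd.st_ino) == (
                st_other.st_dev,
                st_other.st_ino,
            )
            os.unlink(other)           # now the file has no names left
            nlink_after_second = os.fstat(fd).st_nlink
            return nlink_after_first, same_file, nlink_after_second
        finally:
            os.close(fd)


if __name__ == "__main__":
    print(demo_hardlinked_unlink())
```

The link count distinguishes "no names left" from "some name left
somewhere", but nothing here tells the checkpointer the surviving path.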

Finally, these files can often be huge. Copying them elsewhere is a huge
I/O burden compared to carefully relinking the file -- I/O that could be
better spent doing actual work.

We solved all that with "relinking". It's possible to make a relink()
syscall. The code I posted some time ago to containers@ can be easily
adapted for that -- I did so for my testing of those patches. I'm not
exactly sure how it would be done from userspace but I suspect it could
be done.

Perhaps you'll find a different and better way to solve all those
problems unlinked files present. I'd sincerely like to hear about it.

Cheers,
	-Matt

