[BIG RFC] Filesystem-based checkpoint

Serge E. Hallyn serue at us.ibm.com
Fri Oct 31 06:48:41 PDT 2008


Quoting Eric W. Biederman (ebiederm at xmission.com):
> With a file descriptor I can push the data onto a network socket and
> the receiving process is on another computer.  0 copies, 0 trips
> to user space.  I'm not certain how you would achieve that with a filesystem
> approach.

This has been Oren's most convincing argument for all sorts of little
choices (his precise data format, the use of an fd and cr_kwrite()).

I wonder (a) what neat things Dave could come up with to bridge that
gap, and (b) how much of that gap becomes less meaningful with a proper
use of pre-dump (and post-dump).
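
For illustration only, here is a user-space sketch of the property Eric
is describing: a dump routine that takes a descriptor does not care
whether the other end is a file or a socket.  This is an analogy, not the
kernel code path; dump_image() and the image contents are hypothetical
names invented for the example, and it does not show the zero-copy
behaviour itself, only that the caller picks the destination.

```python
import os
import socket
import tempfile


def dump_image(fd, image):
    """Write all of `image` to the descriptor `fd`, retrying on short writes."""
    view = memoryview(image)
    while view:
        written = os.write(fd, view)
        view = view[written:]


# Hypothetical stand-in for a checkpoint image.
image = b"hypothetical checkpoint image" * 4

# Destination 1: a regular file.
with tempfile.TemporaryFile() as f:
    dump_image(f.fileno(), image)
    f.seek(0)
    from_file = f.read()

# Destination 2: a socket. The peer could just as well be on another host.
sender, receiver = socket.socketpair()
dump_image(sender.fileno(), image)
sender.close()

chunks = []
while True:
    chunk = receiver.recv(4096)
    if not chunk:
        break
    chunks.append(chunk)
receiver.close()
from_socket = b"".join(chunks)
```

The same dump_image() call serves both destinations; a filesystem-based
interface would need something extra to get the data onto the wire.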

> >> Reading the memory of another process is a problem, to the point
> >> that the /proc/<pid>/mem interface has been removed from the kernel.
> >
> > Yes, this is certainly true.  All of the ptrace-related security issues
> > surely tell us something.  But, I'm not sure of your point here.  Are
> > you saying that using sys_checkpoint() to dump a process's pages is
> > inherently safer than an approach that uses a filesystem in order to do the
> > same?
> 
> I'm saying inspecting another process is a very racy operation so something
> we need to be especially careful with. 

I don't see any difference there between Dave's and Oren's approaches.
In either case, the container is frozen while the kernel walks the
container's tasks' pages and dumps them... somewhere.

-serge

