How much of a mess does OpenVZ make? ;) Was: What can OpenVZ do?

Serge E. Hallyn serue at
Sun Mar 1 12:02:31 PST 2009

Quoting Alexey Dobriyan (adobriyan at
> On Fri, Feb 27, 2009 at 01:31:12AM +0300, Alexey Dobriyan wrote:
> > This is collecting and start of dumping part of cleaned up OpenVZ C/R
> > implementation, FYI.
> OK, here is second version which shows what to do with shared objects
> (cr_dump_nsproxy(), cr_dump_task_struct()), introduced more checks
> (still no unlinked files) and dumps some more information including
> structures connections (cr_pos_*)
> Dumping pids is still being thought through, because in OpenVZ pids are
> saved as numbers, since CLONE_NEWPID is not allowed in a container. In the
> presence of multiple CLONE_NEWPID levels this must present a big problem.
> Looks like there is no way to avoid dumping pids as a separate object.
> As a result, struct cr_image_pid is variable-sized, don't know how this will
> play later.
> Also, pid refcount check for external pointers is busted right now,
> because /proc inode pins struct pid, so there is almost always refcount
> vs ->o_count mismatch.
> No restore yet. ;-)

Hi Alexey,

thanks for posting this.  Of course there are some predictable responses
(I like the simplicity of pure in-kernel, Dave will not :) but this
needs to be posted to make us talk about it.

A few more comments that came to me while looking it over:

1. The CAP_SYS_ADMIN check is unfortunate.  In discussions about Oren's
patchset we've agreed that not having that check from the outset forces
us to consider security with each new patch and feature, which is a good
thing.

2. if any tasks being checkpointed are frozen, checkpoint has the
side effect of thawing them, right?

3. wrt pids, I guess what you really want is to store the pids from
init_tsk's level down to the task's lowest pid, right?  Then you
manually set each of those on restart?  Any higher pids of course
don't matter.

4. do you have any thoughts on what to do with the mntns info at
restart?  Will you try to detect mounts which need to be re-created?

5. Since you're always setting f_pos, this won't work straight over
a pipe?  Do you figure that's just not a worthwhile feature?
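
To make point 3 concrete, here is a minimal userspace sketch of the kind
of variable-sized record I have in mind.  Everything in it is an
assumption for illustration: struct pid_image, pid_image_size() and
pid_image_create() are hypothetical names, not the struct cr_image_pid
from your patch, and level 0 is assumed to be the outermost pid namespace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical image record (NOT the patch's struct cr_image_pid).
 * nr[0] is the pid as seen from the container init's namespace,
 * nr[nr_levels - 1] is the pid in the task's own (deepest) namespace.
 */
struct pid_image {
	unsigned int nr_levels;	/* number of namespace levels saved */
	int nr[];		/* one pid number per saved level */
};

/* Size in bytes of a record holding nr_levels pid numbers. */
size_t pid_image_size(unsigned int nr_levels)
{
	return sizeof(struct pid_image) + nr_levels * sizeof(int);
}

/*
 * nrs[] holds the task's pid number at every level, index 0 being the
 * outermost namespace.  Only levels top_level..task_level are saved;
 * numbers above the container's init are dropped since they need not
 * be reproduced on restart.
 */
struct pid_image *pid_image_create(const int *nrs, unsigned int task_level,
				   unsigned int top_level)
{
	assert(top_level <= task_level);

	unsigned int n = task_level - top_level + 1;
	struct pid_image *img = malloc(pid_image_size(n));

	if (!img)
		return NULL;
	img->nr_levels = n;
	memcpy(img->nr, nrs + top_level, n * sizeof(int));
	return img;
}
```

With a task whose pid numbers are {4321, 17, 1} from the outermost level
inward, and a container init living at level 1, only 17 and 1 end up in
the record; restart would then have to set those two numbers explicitly.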

Were you saying (in response to Dave) that you're having private
discussions about whether to pursue posting this as an alternative
to Oren's patchset?  If so, any updates on those discussions?


