How much of a mess does OpenVZ make? ;) Was: What can OpenVZ do?

Ingo Molnar mingo at elte.hu
Sat Mar 14 01:25:32 PDT 2009


* Alexey Dobriyan <adobriyan at gmail.com> wrote:

> On Fri, Mar 13, 2009 at 02:01:50PM -0700, Linus Torvalds wrote:
> > 
> > 
> > On Fri, 13 Mar 2009, Alexey Dobriyan wrote:
> > > > 
> > > > Let's face it, we're not going to _ever_ checkpoint any 
> > > > kind of general case process. Just TCP makes that 
> > > > fundamentally impossible in the general case, and there 
> > > > are lots and lots of other cases too (just something as 
> > > > totally _trivial_ as all the files in the filesystem 
> > > > that don't get rolled back).
> > > 
> > > What do you mean here? Unlinked files?
> > 
> > Or modified files, or anything else. "External state" is a 
> > pretty damn wide net. It's not just TCP sequence numbers and 
> > another machine.
> 
> I think (I think) you're seriously underestimating what's 
> doable with kernel C/R and what's already done.
> 
> I was told (haven't seen it myself) that Oracle installations 
> and Counter Strike servers were moved between boxes just fine.
> 
> They were run in specially prepared environment of course, but 
> still.

That's the kind of stuff i'd like to see happen.

Right now the main 'enterprise' approach to do 
migration/consolidation of server contexts is based on hardware 
virtualization - but that pushes runtime overhead to the native 
kernel and slows down the guest context as well - massively so.

Before we've blinked twice it will be a 'required' enterprise 
feature and enterprise people will measure/benchmark Linux 
server performance in guest context primarily and we'll have a 
deep performance pit to dig ourselves out of.

We can ignore that trend as uninteresting (it is uninteresting 
in a number of ways because it is partly driven by stupidity), 
or we can do something about it while still advancing the 
kernel.

With containers+checkpointing the code is a lot scarier (we 
basically do system call virtualization), the environment 
interactions are a lot wider and thus they are a lot more 
difficult to handle - but it's all a lot faster as well, and 
conceptually so. All the runtime overhead is pushed to the 
checkpointing step (with some minimal amount of data structure 
isolation overhead).

I see three conceptual levels of virtualization:

 - hardware based virtualization, for 'unaware OSs'

 - system call based virtualization, for 'unaware software'

 - no virtualization: no kernel help is needed _at all_ to 
   checkpoint 'aware' software. We have libraries to checkpoint 
   'aware' user-space just fine - and have had them for a decade.
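
To make the third level concrete, here is a minimal sketch of 'aware' 
software checkpointing itself with no kernel help. It is only an 
illustration, not one of the libraries referred to above: the names 
save_checkpoint, load_checkpoint and run, the JSON file format and the 
stop_after parameter are all assumptions made up for this example.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to path atomically: temp file, fsync, then rename."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        # os.replace swaps the file in one step, so a reader sees either
        # the old checkpoint or the new one, never a half-written file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def run(path, total_steps, stop_after=None):
    """Sum 0..total_steps-1, checkpointing after every step.

    stop_after (hypothetical) simulates the process being stopped
    early, e.g. just before it is moved to another box.
    """
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    done = 0
    while state["step"] < total_steps:
        if stop_after is not None and done >= stop_after:
            break
        state["acc"] += state["step"]
        state["step"] += 1
        save_checkpoint(path, state)
        done += 1
    return state
```

Calling run() a second time with the same path picks up at the saved 
step instead of starting over. This only works because the program 
knows exactly which state matters; 'unaware' software gives us no such 
hint, which is why it needs the system call or hardware level.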

	Ingo

