checkpoint/restart: taking refcounts on kernel objects

Alexey Dobriyan adobriyan at gmail.com
Fri May 1 05:56:05 PDT 2009


On Tue, Apr 14, 2009 at 10:23:20AM -0700, Dave Hansen wrote:
> On Tue, 2009-04-14 at 21:04 +0400, Alexey Dobriyan wrote:
> > > Right, while I have opinions on some things in this list, I didn't
> > > mean to imply positions on these items.  My question was:  are
> > > there differences you want to call out?
> > 
> > Sorry? "none needed" is relevant to only item 3. If tasks don't
> > disappear during checkpoint, why would netns disappear?
> > Taking refcount on checkpoint(2) is likely unneeded.
> > 
> > But it's low-level detail anyway.
> 
> I guess it is a matter of whether we consider a task that gets unfrozen
> a kernel bug or not.  If we don't take refcounts and we do reference an
> object that disappears, then we *certainly* have a kernel bug that can
> crash the kernel.  If we take refcounts, we at least limit the ways in
> which the kernel can crash when something screwy happens.
> 
> On the other hand, the objhash is a kinda weird way to do it.  Taking
> and releasing arbitrary refcounts on arbitrary kernel objects is one
> level of abstraction too many for me.

Hm, I take back this objection (to taking refcounts at checkpoint(2)
time). It's easier and safer to always grab a reference when putting a
checkpointed object into the hash/list/whatever, so that the refcount
stays correct.
On context destroy, every object is put, regardless of whether the
context was used for checkpointing or restarting.
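
To make the rule concrete, here is a rough userspace sketch of what I
mean. It is not code from any posted patchset: struct obj, struct
cr_ctx, ctx_add() and ctx_destroy() are hypothetical names, the plain
int counter stands in for whatever refcount the real object carries,
and a linked list stands in for the hash. The only point is that a
reference is taken exactly once when an object enters the context and
dropped exactly once when the context goes away.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted kernel object (e.g. a netns). */
struct obj {
	int refcount;
};

static void obj_get(struct obj *o)
{
	o->refcount++;
}

static void obj_put(struct obj *o)
{
	assert(o->refcount > 0);
	o->refcount--;
}

/* One entry per object tracked by the (hypothetical) C/R context. */
struct ctx_entry {
	struct obj *obj;
	struct ctx_entry *next;
};

struct cr_ctx {
	struct ctx_entry *head;
};

/*
 * Track an object in the context, taking one reference on it.
 * Adding an object that is already tracked is a no-op, so the
 * context never holds more than one reference per object.
 * Returns 0 on success, -1 if allocation fails.
 */
static int ctx_add(struct cr_ctx *ctx, struct obj *o)
{
	struct ctx_entry *e;

	for (e = ctx->head; e; e = e->next)
		if (e->obj == o)
			return 0;

	e = malloc(sizeof(*e));
	if (!e)
		return -1;

	obj_get(o);
	e->obj = o;
	e->next = ctx->head;
	ctx->head = e;
	return 0;
}

/*
 * Drop every reference the context holds. The same path is used
 * whether the context was checkpointing or restarting.
 */
static void ctx_destroy(struct cr_ctx *ctx)
{
	while (ctx->head) {
		struct ctx_entry *e = ctx->head;

		ctx->head = e->next;
		obj_put(e->obj);
		free(e);
	}
}
```

With this shape, an object that starts at refcount 1 goes to 2 on the
first ctx_add(), stays at 2 on a repeated ctx_add(), and returns to 1
after ctx_destroy(), so nothing depends on the tasks staying frozen
for the object to remain valid while the context references it.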

