[RFC][PATCH 00/11] track files for checkpointability
Serge E. Hallyn
serge at hallyn.com
Thu Mar 12 08:30:48 PDT 2009
Quoting Cedric Le Goater (legoater at free.fr):
> >> And if Ingo's requirement is fulfilled, would any C/R patchset be acceptable ?
> > Yup, no matter how hideous :) Ok not really.
> > But the point was that it wasn't Dave not understanding Alexey's
> > suggestion, but Greg not understanding Ingo's. If you think Ingo's
> > goal isn't worthwhile or achievable, then argue that (as I am), don't
> > keep elaborating on something we all agree will be needed (Alexey's
> > suggestion or some other way of doing a true may-be-checkpointed test).
> I rather spend my time on enabling things rather than forbid them.
That sure sounds productive. How could I argue with that.
But wait, haven't several teams been doing that for years? So why is
c/r not in the upstream kernel? Could it be that ignoring the
upstream maintainers' concerns about (a) treating the feature as a
toy, (b) long-term maintainability, and (c) c/r becoming an impediment
to future features, and instead hacking away at our toy feature, is
*not* always the best course?
Now I actually don't think you believe it's politically possible for
c/r to get upstream. But I do.
Getting back on track. Ingo's concern is that this will turn into a toy feature
rather than something which serious apps can rely on. To address that
he wants an application to be able to tell whether or not it is
checkpointable, and, if not, then why not.
Now Alexey also has a suggestion for addressing this. It has (at least)
three shortcomings relative to Dave's:
1. It is more prone to race conditions. If an app opens an
uncheckpointable file briefly and then closes it, and later reopens
it, then in between it may think it is checkpointable, even though it
could already have known that it is not always so. If you want to argue
that returning -EAGAIN is better in that case, that seems reasonable.
2. For repeatedly checking the checkpointability of large
applications it could be much more costly. For instance, if we have
to check the flags on each vma, and an application has 10s or 100s of
gigs of memory, each check for checkpointability would require walking
all those vmas each time. Dave's approach has the advantage of only
checking those when the resource is opened.
3. One of the things Ingo likes about Dave's approach, which
you may think is bogus, is that users of an uncheckpointable application
will scream more loudly if the app becomes permanently uncheckpointable
(and they know why), than if it sometimes works and sometimes doesn't.
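
To make the tradeoff in the points above concrete, here is a rough
userspace sketch of the two styles of test. It is not code from either
patchset: struct task, struct resource, task_open(), check_sticky() and
check_scan() are all hypothetical names made up for illustration. The
"sticky" variant is my reading of Dave's approach (decide once, when the
resource is opened, and remember why), and the "scan" variant is my
reading of Alexey's (walk everything the task holds at the time of the
check).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are not kernel structures. */
#define MAX_RES 8

struct resource {
	bool checkpointable;
	bool open;
};

struct task {
	struct resource res[MAX_RES];
	size_t nres;
	bool may_checkpoint;	/* sticky flag, only ever cleared */
	const char *why_not;	/* first reason the flag was cleared */
};

static void task_init(struct task *t)
{
	assert(t != NULL);
	t->nres = 0;
	t->may_checkpoint = true;
	t->why_not = NULL;
}

/* Open-time tracking: the check happens once, when the resource is opened. */
static int task_open(struct task *t, bool checkpointable, const char *name)
{
	assert(t != NULL && t->nres < MAX_RES);
	size_t id = t->nres++;

	t->res[id].checkpointable = checkpointable;
	t->res[id].open = true;
	if (!checkpointable && t->may_checkpoint) {
		t->may_checkpoint = false;	/* permanent from here on */
		t->why_not = name;
	}
	return (int)id;
}

static void task_close(struct task *t, int id)
{
	assert(t != NULL && id >= 0 && (size_t)id < t->nres);
	t->res[id].open = false;
}

/* Sticky test: constant time, but never recovers after a close. */
static bool check_sticky(const struct task *t)
{
	return t->may_checkpoint;
}

/* Scan test: exact at this instant, but walks every resource per call. */
static bool check_scan(const struct task *t)
{
	for (size_t i = 0; i < t->nres; i++)
		if (t->res[i].open && !t->res[i].checkpointable)
			return false;
	return true;
}
```

In this sketch, if a task opens an uncheckpointable resource and then
closes it, check_scan() goes back to saying yes while check_sticky() keeps
saying no and still has the reason in why_not. That is the "sometimes
works and sometimes doesn't" versus "permanently uncheckpointable, and
they know why" difference, and the loop in check_scan() is the per-check
cost that grows with the number of resources (or vmas).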
The funny thing is, for simplicity I actually prefer Alexey's approach.
It's easier (and therefore seems more robust) to tell if a task has a
particular sort of file at a definite point in time, than to try and
catch all the ways such an open file can be opened or received. Which
is where Ingo's LSM suggestion is seductive, but I'm convinced that
approach would be politically impossible.