[RFC][PATCH 00/11] track files for checkpointability
dave at linux.vnet.ibm.com
Fri Mar 6 08:46:05 PST 2009
On Fri, 2009-03-06 at 10:23 -0600, Serge E. Hallyn wrote:
> Which imo is fine, but my question is whether that leaves any actual
> value in the persistent per-resource uncheckpointable flag.
OK, let's take a look back at this discussion a little bit and how we
> Yeah, per resource it should be. That's per task in the normal
> case - except for threaded workloads where it's shared by
> Uncheckpointable should be a one-way flag anyway. We want this
> to become usable, so uncheckpointable functionality should be as
> painful as possible, to make sure it's getting fixed ...
> Is there any automated test that could discover C/R breakage via
> brute force? All that matters in such cases is to get the "you
> broke stuff" information as soon as possible. If it comes at an
> early stage developers can generally just fix stuff.
You add these things together and you get what I posted. My patch:
1. is per resource
2. has a one-way flag
3. gives messages to developers at an early stage (dmesg) and lets them
explore it more thoroughly (/proc)
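
To make the three properties concrete, here is a minimal userspace sketch.
It is not the code from the patch series: struct resource,
mark_uncheckpointable() and report() are hypothetical names made up for
illustration, and fprintf(stderr, ...) stands in for a dmesg line.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-resource object such as an open file. */
struct resource {
    const char *name;
    bool uncheckpointable;   /* one-way: set once, never cleared */
    const char *reason;      /* first reason recorded, kept for later inspection */
};

/*
 * Mark the resource as uncheckpointable. Only the first transition prints
 * a message (the dmesg analogue); later calls are silent no-ops.
 * Returns true if this call changed the flag.
 */
static bool mark_uncheckpointable(struct resource *r, const char *reason)
{
    if (r->uncheckpointable)
        return false;
    r->uncheckpointable = true;
    r->reason = reason;
    fprintf(stderr, "%s is not checkpointable: %s\n", r->name, reason);
    return true;
}

/* Format the kind of line a /proc-style report could show for one resource. */
static int report(const struct resource *r, char *buf, size_t len)
{
    return snprintf(buf, len, "%s: %s", r->name,
                    r->uncheckpointable ? r->reason : "checkpointable");
}
```

There is deliberately no function that clears the flag, so the state
survives until the resource itself goes away.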
But, these "early stage" messages are completely opposed to an approach
that uses sys_checkpoint() in some form (like with a -1 fd as an
Think of it like lockdep. We *could* have designed lockdep to simply
give us a nice message whenever we do an a/b b/a deadlock. That would
be helpful. Or, we could design it to record all lock acquisitions that
didn't deadlock to see if they could ever possibly deadlock. (We did the
second one, btw). That gave an early, useful warning that developers
could fix before we encounter an actual problem. I'm advocating such a
mechanism for c/r.
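
As a rough illustration of the second design, here is a toy userspace
sketch of recording acquisition order and warning on an inversion before
any real deadlock happens. It is not lockdep's implementation; the names
(order, acquire, release, MAX_CLASSES) and the small fixed-size table are
assumptions made for the example.

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_CLASSES 8

/* order[a][b] is true once lock b has been taken while a was held. */
static bool order[MAX_CLASSES][MAX_CLASSES];
static int held[MAX_CLASSES];   /* locks currently held, in no fixed order */
static int nheld;

/* Is there a recorded chain of dependencies leading from 'from' to 'to'? */
static bool reaches(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_CLASSES; next++)
        if (order[from][next] && !seen[next] && reaches(next, to, seen))
            return true;
    return false;
}

/*
 * Record an acquisition. Returns false (and warns) if taking 'lock' now
 * contradicts an order seen earlier, even though nothing has deadlocked.
 */
static bool acquire(int lock)
{
    bool ok = true;

    if (lock < 0 || lock >= MAX_CLASSES || nheld >= MAX_CLASSES)
        return false;
    for (int i = 0; i < nheld; i++) {
        bool seen[MAX_CLASSES] = { false };

        if (reaches(lock, held[i], seen)) {
            fprintf(stderr, "possible deadlock: %d taken while holding %d, "
                    "opposite order seen before\n", lock, held[i]);
            ok = false;
        } else {
            order[held[i]][lock] = true;
        }
    }
    held[nheld++] = lock;
    return ok;
}

/* Drop a held lock; the recorded ordering is kept. */
static void release(int lock)
{
    for (int i = 0; i < nheld; i++) {
        if (held[i] == lock) {
            held[i] = held[--nheld];
            break;
        }
    }
}
```

Taking 0 then 1, releasing both, and later taking 1 then 0 triggers the
warning on the last acquisition, although a single thread doing this can
never actually deadlock. That is the "early warning" property.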