[RFC][PATCH 2/2] CR: handle a single task with private memory maps
jruscio at evergrid.com
Mon Aug 4 20:51:37 PDT 2008
On Aug 4, 2008, at 7:37 PM, Oren Laadan wrote:
>>>> The point is that you need previous data when building an
>>>> checkpoint, so you will read it at least. And since it was
>>>> previously stored (in
>>> The scheme that I described above and is implemented in Zap does not
>>> require access to previous checkpoints when building a new incremental
>>> Instead, you keep some data structure in the kernel that describes the
>>> pieces that you need to carry with you (what pages were saved, and where;
>>> when a task exits, the data describing its mm will be discarded, of
>>> course, and so on).
>> This is because you probably decided that a mechanism in the kernel that
>> saves storage space was not interesting if it does not improve speed. As a
>> consequence you need to keep metadata in kernel memory in order to do
>> incremental checkpoint. Maybe saving storage space without
>> speed could equally be done from userspace with sort of checkpoint
>> tools that would create an incremental checkpoint 2' from two full
>> checkpoints 1 and 2.
> Good point. In fact, the meta data is not only kept in memory, but also
> saved with each incremental checkpoint (well, its version at checkpoint
> time), so that restart would know where to find older data. So it is already
> to user space; we may as well provide the option to keep it only in
> user space.
As somewhat of a tangent to this discussion, I've been giving some
thought to the general strategy we talked about during the summit. The
checkpointing solution we built at Evergrid sits completely in
userspace and is solely focused on checkpointing parallel codes (e.g.
MPI). That approach required us to virtualize a whole slew of
resources (e.g. PIDs) that will be far better supported in the kernel
through this effort. On the other hand, there isn't anything inherent
to checkpointing the memory of a process that requires it to be done in
the kernel. During a restart, you can map and load the memory from the
checkpoint file in userspace as easily as in the kernel. Since the
cost of checkpointing HPC codes is fairly dominated by checkpointing
their large memory footprints, memory checkpointing is an area of
ongoing research with many different solutions.
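
As a rough illustration of the userspace restart path, here is a minimal
Python sketch. The record layout (HEADER), the region ids, and the
save_regions/restore_regions helpers are all hypothetical, made up for
this example; it does not describe any existing implementation or the
proposed kernel interface. It only shows that an ordinary process can
recreate anonymous mappings and fill them from a checkpoint file.

```python
import mmap
import struct

# Hypothetical record header: region id and length in bytes,
# both unsigned 64-bit little-endian integers.
HEADER = struct.Struct("<QQ")


def save_regions(path, regions):
    """Write each (region_id, bytes) pair as a header followed by raw data."""
    with open(path, "wb") as f:
        for region_id, data in regions:
            f.write(HEADER.pack(region_id, len(data)))
            f.write(data)


def restore_regions(path):
    """Recreate every saved region as an anonymous mapping and fill it."""
    restored = {}
    with open(path, "rb") as f:
        while True:
            header = f.read(HEADER.size)
            if not header:
                break  # clean end of file
            if len(header) != HEADER.size:
                raise IOError("truncated record header")
            region_id, length = HEADER.unpack(header)
            data = f.read(length)
            if len(data) != length:
                raise IOError("truncated region data")
            region = mmap.mmap(-1, length)  # fileno -1: anonymous memory
            region[:] = data                # load the saved contents
            restored[region_id] = region
    return restored
```

A real restart would also have to place each region at its original
address with its original protections, which this sketch leaves out.
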
It might be desirable for the checkpointing implementation to be
modular enough that a userspace application or library could choose to
handle certain resources on its own. Memory is the primary one that
comes to mind.
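
To make the idea a bit more concrete, here is one possible shape for
such a modular interface, again as a Python sketch. Everything in it is
an assumption for illustration: the Checkpointer class, the resource
names, the dictionary used to stand in for a task, and the handlers are
hypothetical and not part of the patch set. The point is only that each
resource has a default handler and an application may replace the one
for memory while leaving the others alone.

```python
def default_save_pids(task):
    """Stand-in for the built-in handler that records the task's PID."""
    return {"pid": task["pid"]}


def default_save_memory(task):
    """Stand-in for the built-in handler: copy every region in full."""
    return {name: bytes(data) for name, data in task["memory"].items()}


class Checkpointer:
    """Hypothetical checkpointer with one replaceable handler per resource."""

    def __init__(self):
        self._handlers = {
            "pids": default_save_pids,
            "memory": default_save_memory,
        }

    def override(self, resource, handler):
        """Let the application take over a known resource."""
        if resource not in self._handlers:
            raise KeyError(f"unknown resource: {resource}")
        self._handlers[resource] = handler

    def checkpoint(self, task):
        """Run every handler and collect the results by resource name."""
        return {name: handler(task) for name, handler in self._handlers.items()}


def save_dirty_only(task):
    """Example application handler: keep only regions marked dirty."""
    return {
        name: bytes(data)
        for name, data in task["memory"].items()
        if name in task["dirty"]
    }
```

An application that wants its own memory strategy would call
override("memory", save_dirty_only) before checkpointing; the PID
handling stays with the default.
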