[RFC v2][PATCH 2/9] General infrastructure for checkpoint restart

Oren Laadan orenl at cs.columbia.edu
Sat Aug 23 22:58:36 PDT 2008



Louis Rilling wrote:
> On Wed, Aug 20, 2008 at 11:04:13PM -0400, Oren Laadan wrote:
>> Add those interfaces, as well as helpers needed to easily manage the
>> file format. The code is roughly broken out as follows:
>>
>> ckpt/sys.c - user/kernel data transfer, as well as setup of the
>> checkpoint/restart context (a per-checkpoint data structure for
>> housekeeping)
>>
>> ckpt/checkpoint.c - output wrappers and basic checkpoint handling
>>
>> ckpt/restart.c - input wrappers and basic restart handling
>>
>> Patches to add the per-architecture support as well as the actual
>> work to do the memory checkpoint follow in subsequent patches.
>>
> 
> [...]
> 
>> diff --git a/checkpoint/sys.c b/checkpoint/sys.c
>> new file mode 100644
>> index 0000000..2891c48
>> --- /dev/null
>> +++ b/checkpoint/sys.c
> 
> [...]
> 
>> +/*
>> + * helpers to manage CR contexts: allocated for each checkpoint and/or
>> + * restart operation, and persist until the operation is completed.
>> + */
>> +
>> +static atomic_t cr_ctx_count;	/* unique checkpoint identifier */
> 
> I thought we agreed that this counter should be per-container. Perhaps add a
> TODO here?

True.

> 
>> +
>> +void cr_ctx_free(struct cr_ctx *ctx)
>> +{
>> +
>> +	if (ctx->file)
>> +		fput(ctx->file);
>> +	if (ctx->vfsroot)
>> +		path_put(ctx->vfsroot);
>> +
>> +	free_pages((unsigned long) ctx->tbuf, CR_TBUF_ORDER);
>> +	free_pages((unsigned long) ctx->hbuf, CR_HBUF_ORDER);
>> +
>> +	kfree(ctx);
>> +}
>> +
>> +struct cr_ctx *cr_ctx_alloc(pid_t pid, struct file *file, unsigned long flags)
>> +{
>> +	struct cr_ctx *ctx;
>> +
>> +	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
>> +	if (!ctx)
>> +		return NULL;
>> +
>> +	ctx->tbuf = (void *) __get_free_pages(GFP_KERNEL, CR_TBUF_ORDER);
>> +	ctx->hbuf = (void *) __get_free_pages(GFP_KERNEL, CR_HBUF_ORDER);
>> +	if (!ctx->tbuf || !ctx->hbuf)
>> +		goto nomem;
>> +
>> +	ctx->pid = pid;
>> +	ctx->flags = flags;
>> +
>> +	ctx->file = file;
>> +	get_file(file);
>> +
>> +	/* assume checkpointer is in container's root vfs */
> 
> I'm a bit puzzled by this assumption. I would say: either this is a
> self-checkpoint (only current process), or this is a container checkpoint. In
> the latter case, I expect that in the general case the checkpointer lives
> outside the container (and the interface of sys_checkpoint() below confirms
> this), so its root fs is probably not the container's one.
> 
> Does it differ from what you're planning?

You are correct. We lack infrastructure for what I'd call a "container-object",
and the patchset does not yet tackle a container and multiple tasks, so this
is an interim solution. Will add a FIXME comment.

Thanks,

Oren.
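
For readers following the quoted cr_ctx_alloc()/cr_ctx_free() pair, the
allocate-then-unwind pattern can be sketched in userspace as below. This is an
illustration under assumptions, not the patch's code: the quoted excerpt does
not show the nomem label, so the cleanup path here is a guess at the usual
idiom, and struct ctx, ctx_alloc() and ctx_free() are hypothetical names that
use malloc()/free() in place of the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified context holding two scratch buffers. */
struct ctx {
	void *tbuf;
	void *hbuf;
};

/* Release a context; safe on NULL and on a partially built context. */
static void ctx_free(struct ctx *ctx)
{
	if (!ctx)
		return;
	free(ctx->tbuf);	/* free(NULL) is a no-op */
	free(ctx->hbuf);
	free(ctx);
}

/* Allocate a zeroed context plus both buffers, or return NULL. */
static struct ctx *ctx_alloc(size_t tbuf_size, size_t hbuf_size)
{
	struct ctx *ctx;

	assert(tbuf_size > 0 && hbuf_size > 0);

	ctx = calloc(1, sizeof(*ctx));
	if (!ctx)
		return NULL;

	ctx->tbuf = malloc(tbuf_size);
	ctx->hbuf = malloc(hbuf_size);
	if (!ctx->tbuf || !ctx->hbuf)
		goto nomem;

	return ctx;

nomem:
	/* Assumed cleanup path: undo whatever was allocated so far. */
	ctx_free(ctx);
	return NULL;
}
```

Because the context is zeroed at allocation, a single free routine can serve
both the normal teardown and the failure path, which keeps the error handling
in one place.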


