[PATCH 2/2] Add support for the per-task sem_undo list (v3.01)

Oren Laadan orenl at cs.columbia.edu
Thu Aug 5 15:40:57 PDT 2010



On 08/05/2010 05:48 PM, Matt Helsley wrote:
> On Thu, Aug 05, 2010 at 10:57:06AM -0700, Dan Smith wrote:
>> The semaphore undo list is a set of adjustments to be made to semaphores
>> held by a task on exit.  Right now, we do not checkpoint or restore this
>> list which could cause undesirable behavior by a restarted process on exit.

[...]

>> +{
>> +	int i;
>> +	int ret;
>> +
>> +	for (i = 0; i<  count; i++) {
>> +		struct sem_undo *un;
>> +
>> +		spin_lock(&ulp->lock);
>> +		un = lookup_undo(ulp, semids[i]);
>> +		spin_unlock(&ulp->lock);
>> +
>> +		if (!un) {
>> +			ckpt_debug("unable to lookup semid %i\n", semids[i]);
>
> Or should this be a ckpt_err() ?

This is not supposed to happen: the semdis[] array was
extracted just before, and it should not be possible for
them to disappear (at least in the container-checkpoint
case, where tasks are frozen and ipcns does not leak).

>
>> +			return -EINVAL;
>> +		}
>> +
>> +		ret = checkpoint_sem_undo_adj(ctx, un);
>> +		ckpt_debug("checkpoint_sem_undo: %i\n", ret);
>> +		if (ret<  0)
>> +			return ret;
>> +	}
>> +
>> +	return 0;
>> +}
>
> <snip>
>
>> +static int restore_task_sem_undo_adj(struct ckpt_ctx *ctx)

Now that Matt said it ... I suggest to rename this too to
something like "restore_sem_undo_adj()" :)

(Interestingly, the checkpoint counterpart is named so !)

>> +{
>> +	struct ckpt_hdr_task_sem_undo *h;
>> +	int len;
>> +	int ret = -ENOMEM;
>> +	struct sem_undo *un;
>> +	int nsems;
>> +	__s16 *semadj = NULL;
>> +
>> +	h = ckpt_read_obj_type(ctx, sizeof(*h), CKPT_HDR_TASK_SEM_UNDO);
>> +	if (IS_ERR(h))
>> +		return PTR_ERR(h);
>> +
>> +	len = sizeof(__s16) * h->semadj_count;
>> +	ret = ckpt_read_payload(ctx, (void **)&semadj, len, CKPT_HDR_BUFFER);
>> +	if (ret<  0)
>> +		goto out;
>> +
>
> As best I can tell exit_sem() does not do permission checking -- it
> relies on semop and semtimedop to check permissions on the operation
> that created the undo in the first place. Thus I think we need to check
> permissions here with ipc_check_perms() or ipcperms(). Otherwise it may
> be possible to manipulate semaphores we do not have permission to
> manipulate.

Good catch !

>
>> +	un = find_alloc_undo(current->nsproxy->ipc_ns, h->semid);
>> +	if (IS_ERR(un)) {
>> +		ret = PTR_ERR(un);
>> +		ckpt_debug("unable to find semid %i\n", h->semid);
>> +		goto out;
>> +	}
>> +
>> +	nsems = sem_undo_nsems(un, current->nsproxy->ipc_ns);
>> +	len = sizeof(__s16) * nsems;
>> +	if (h->semadj_count == nsems)
>> +		memcpy(un->semadj, semadj, len);
>> +	rcu_read_unlock();
>> +
>> +	if (nsems != h->semadj_count)
>> +		ckpt_err(ctx, -EINVAL,
>> +			 "semid %i has nmsems=%i but %i undo adjustments\n",
>> +			 h->semid, nsems, h->semadj_count);
>> +	else
>> +		ckpt_debug("semid %i restored with %i adjustments\n",
>> +			   h->semid, h->semadj_count);
>> + out:
>> +	ckpt_hdr_put(ctx, h);
>> +	kfree(semadj);
>> +
>> +	return ret;
>> +}
>> +
>> +static void *restore_sem_undo(struct ckpt_ctx *ctx)
>> +{
>> +	struct ckpt_hdr_task_sem_undo_list *h;
>> +	struct sem_undo_list *ulp = NULL;
>> +	int i;
>> +	int ret = 0;
>> +
>> +	h = ckpt_read_obj_type(ctx, sizeof(*h), CKPT_HDR_TASK_SEM_UNDO_LIST);
>> +	if (IS_ERR(h))
>> +		return ERR_PTR(PTR_ERR(h));
>> +

Maybe add this:
	/*
	 * On success, alloc_undo_list() attaches the new @ulp
	 * to current task - so no need for explicit cleanup
	 */
>> +	ulp = alloc_undo_list(current->nsproxy->ipc_ns);
>> +	if (!ulp) {
>> +		ret = -ENOMEM;
>> +		goto out;
>> +	}
>> +
>> +	for (i = 0; i<  h->count; i++) {
>> +		ret = restore_task_sem_undo_adj(ctx);
>> +		if (ret<  0)
>> +			goto out;
>> +	}
>> +

Maybe add this:

	/* success: account for reference in the objhash*/
>> +	atomic_inc(&ulp->refcnt);
>> + out:
>> +	ckpt_hdr_put(ctx, h);
>> +	if (ret<  0)
>
> Don't we need to check ulp and clean up the alloc'd undo list here?

No: the ulp is attached to current ask (a side effect of
alloc_undo_list) so will be freed when the task exits.
The addition atomic_inc() is only performed on success,
to account for it being added to the hash once we return.

Dan: I suggest to add the comment above to make it clear.

Oren.



More information about the Containers mailing list