[PATCH 2/2] Ensures correct concurrent opening/reading of pidlists across pid namespaces

Andrew Morton akpm at linux-foundation.org
Thu Jul 2 16:54:13 PDT 2009


On Thu, 02 Jul 2009 16:26:25 -0700
Paul Menage <menage at google.com> wrote:

> Ensures correct concurrent opening/reading of pidlists across pid namespaces
> 
> Previously, two processes from different pid namespaces reading the tasks or
> procs file could result in one process seeing results from the other's
> namespace. Rather than one pidlist for each file in a
> cgroup, we now keep a list of pidlists keyed by namespace and file type (tasks
> versus procs) in which entries are placed on demand. Each pidlist has its own
> lock, and because the pidlists themselves are passed around in the seq_file's
> private pointer, we don't have to touch the cgroup or its master list except
> when creating and destroying entries.
> 
> Signed-off-by: Ben Blum <bblum at google.com>
> Reviewed-by: Paul Menage <menage at google.com>
> Signed-off-by: Paul Menage <menage at google.com>

The way these patches were sent indicates that you were their primary
author.  Is that accurate?  If not, they should have had

From: Ben Blum <bblum at google.com>

at the very top of the changelog.

>
> ...
>
>  /**
> + * find the appropriate pidlist for our purpose (given procs vs tasks)
> + * returns with the lock on that pidlist already held, and takes care
> + * of the use count, or returns NULL with no locks held if we're out of
> + * memory.
> + */

Comment purports to be kerneldoc, but isn't.
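
For illustration, a kerneldoc-style version of that comment might look
like the sketch below.  A kerneldoc comment opens with `/**', names the
function with a short description, and documents each parameter on an
`@name:' line.  The stand-in declarations and the stub body are there only
so the fragment compiles on its own; the enum constant names are
assumptions, and the description is paraphrased from the original comment.

```c
#include <stddef.h>

/* Stand-in declarations so this sketch compiles outside the kernel tree. */
struct cgroup;
struct cgroup_pidlist;
enum cgroup_filetype { CGROUP_FILE_PROCS, CGROUP_FILE_TASKS }; /* assumed names */

/**
 * cgroup_pidlist_find - find the pidlist matching a file type
 * @cgrp: the cgroup whose pidlists are searched
 * @type: which file the pidlist backs (procs or tasks)
 *
 * Returns with the lock on that pidlist already held and the use count
 * taken care of, or returns NULL with no locks held if we're out of memory.
 */
static struct cgroup_pidlist *cgroup_pidlist_find(struct cgroup *cgrp,
						  enum cgroup_filetype type)
{
	/* Stub body: the real lookup is in the patch, not reproduced here. */
	(void)cgrp;
	(void)type;
	return NULL;
}
```

Alternatively, starting the comment with a plain `/*' keeps it an ordinary
comment.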

> +static struct cgroup_pidlist *cgroup_pidlist_find(struct cgroup *cgrp,
> +						  enum cgroup_filetype type)
> +{
> +	struct cgroup_pidlist *l;
> +	/* don't need task_nsproxy() if we're looking at ourself */
> +	struct pid_namespace *ns = get_pid_ns(current->nsproxy->pid_ns);
> +	mutex_lock(&cgrp->pidlist_mutex);
> +	list_for_each_entry(l, &cgrp->pidlists, links) {
> +		if (l->key.type == type && l->key.ns == ns) {
> +			/* found a matching list - drop the extra refcount */
> +			put_pid_ns(ns);
> +			/* make sure l doesn't vanish out from under us */

This looks fishy.

> +			down_write(&l->mutex);
> +			mutex_unlock(&cgrp->pidlist_mutex);
> +			l->use_count++;
> +			return l;

The caller of cgroup_pidlist_find() must ensure that l->use_count > 0,
otherwise cgroup_pidlist_find() cannot safely use `l' - it could be
freed at any time.  But if l->use_count > 0, there is no risk of `l'
"vanishing out from under us".

I'm probably wrong there, but that's the usual pattern and this code
looks like it's doing something different.  Please check?
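
For reference, the usual pattern could be sketched in userspace C roughly
as follows.  This is a hypothetical illustration, not the cgroup code:
`struct registry', `struct entry', entry_get() and entry_put() are made-up
names, and a pthread mutex stands in for the list lock.  The point is that
the reference is taken while the list lock is still held, so the entry is
pinned before anyone else gets a chance to free it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct entry {
	int key;
	int use_count;		/* protected by registry.lock */
	struct entry *next;
};

struct registry {
	pthread_mutex_t lock;	/* protects the list and every use_count */
	struct entry *head;
};

/*
 * Look up @key, creating the entry on demand.  Returns with one
 * reference held, or NULL if allocation fails.
 */
static struct entry *entry_get(struct registry *r, int key)
{
	struct entry *e;

	pthread_mutex_lock(&r->lock);
	for (e = r->head; e; e = e->next)
		if (e->key == key)
			break;
	if (!e) {
		e = calloc(1, sizeof(*e));
		if (!e) {
			pthread_mutex_unlock(&r->lock);
			return NULL;
		}
		e->key = key;
		e->next = r->head;
		r->head = e;
	}
	e->use_count++;		/* pin the entry before dropping the list lock */
	pthread_mutex_unlock(&r->lock);
	return e;
}

/* Drop one reference.  Returns 1 if the entry was unlinked and freed. */
static int entry_put(struct registry *r, struct entry *e)
{
	int freed = 0;

	pthread_mutex_lock(&r->lock);
	assert(e->use_count > 0);
	if (--e->use_count == 0) {
		struct entry **pp = &r->head;

		while (*pp != e)	/* e is on the list while referenced */
			pp = &(*pp)->next;
		*pp = e->next;
		free(e);
		freed = 1;
	}
	pthread_mutex_unlock(&r->lock);
	return freed;
}
```

With this shape, a caller that holds a reference from entry_get() can take
any per-entry lock afterwards without worrying about the entry being freed,
because the last entry_put() is the only place that frees it.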

> +		}
> +	}
> +	/* entry not found; create a new one */
> +	l = kmalloc(sizeof(struct cgroup_pidlist), GFP_KERNEL);
> +	if (!l) {
> +		mutex_unlock(&cgrp->pidlist_mutex);
> +		return l;
> +	}
> +	init_rwsem(&l->mutex);
> +	down_write(&l->mutex);
> +	l->key.type = type;
> +	l->key.ns = ns;
> +	l->use_count = 0; /* don't increment here */
> +	l->list = NULL;
> +	l->owner = cgrp;
> +	list_add(&l->links, &cgrp->pidlists);
> +	mutex_unlock(&cgrp->pidlist_mutex);
> +	return l;
> +}
> +
>
> ...
>


