[PATCH review 11/11] mnt: Honor MNT_LOCKED when detaching mounts

Eric W. Biederman ebiederm at xmission.com
Fri Jan 16 18:29:39 UTC 2015


Al Viro <viro at ZenIV.linux.org.uk> writes:

> On Sat, Jan 10, 2015 at 05:51:48AM +0000, Al Viro wrote:
>> On Fri, Jan 09, 2015 at 11:32:47PM -0600, Eric W. Biederman wrote:
>> 
>> > I don't believe rcu anything in this function itself buys you anything,
>> > but structuring this primitive so that it can be called from an rcu list
>> > traversal seems interesting.  
>> 
>> ???
>> 
>> Without RCU, what would prevent it being freed right under us?
>> 
>> The whole point is to avoid pinning it down - as it is, we can have
>> several processes call ->kill() on the same object.  The first one
>> would end up doing cleanup, the rest would wait *without* *affecting*
>> *fs_pin* *lifetime*.
>> 
>> Note that I'm using autoremove there for wait.func(), then in the wait
>> loop I check (without locks) wait.task_list being empty.  It is racy;
>> deliberately so.  All I really care about in there is checking that
>> wait.func has not been called until after rcu_read_lock().  If that is
>> true, we know that p->wait hadn't been woken until that point, i.e.
>> p hadn't reached rcu delay on the way to being freed until after our
>> rcu_read_lock().  Ergo, it can't get freed until we do rcu_read_unlock()
>> and we can safely take p->wait.lock.
>> 
>> RCU is very much relevant there.
>
> FWIW, I've just pushed a completely untested tree in #experimental-fs_pin;
> it definitely will be reordered, etc., probably with quite a few of the
> patches from the beginning of your series mixed in, but the current tree
> in there should show at least what I'm aiming at.

I have merged the work you have been doing and what I have been doing
and posted it to a branch #for-testing of my user-namespace.git tree.

And yes, I managed to make the core of the pin primitive not care about
RCU, and I think I will need that property to clean up some of the
weirdness that I still see with using fs_pin.

pin_insert does not wind up being a clean primitive: adding to both
lists at the same time does not yield particularly clean or obvious
locking rules, or a clean locking implementation.

Still, the code works and is a good starting point for further discussion
and thinking.  I am posting the code while I go off to see if I can spot
better ways to clean some of these things up.

Eric


