[PATCH review 3/8] dcache: Clearly separate the two directory rename cases in d_splice_alias

Al Viro viro at ZenIV.linux.org.uk
Sat Aug 15 06:16:17 UTC 2015

On Thu, Aug 13, 2015 at 11:31:47PM -0500, Eric W. Biederman wrote:

> The problem is that as the lookup locking stands today grabbing the
> s_vfs_rename_mutex must use mutex_trylock which can fail, so for reliability
> reasons we need to avoid using the rename_mutex as much as possible.

I really don't like it.  For one thing, you *still* are taking it - have to,
really.  So this argument is moot anyway.  ESTALE can happen here.  For
another, I'm not convinced that this "we don't need no stinkin'
extra locks for attaching a detached subtree" is correct.  E.g. what's
to protect that IS_ROOT(new) from changing right under you?

Consider a corrupted filesystem (or a bogus server, or a sufficiently
unpleasant race with another client, etc.)

You have a detached subtree with lookups from *TWO* directories trying to
attach its root.  Now what?  ->i_mutex on either prospective parent won't
give a damn thing - different inodes.  The current tree would've serialized
those on rename_lock and treated that as attach a detached followed by
relocate a misplaced.  With your change we get a race in there.

This place is subtle and nasty, and we had rather nasty races there.
Repeatedly.  Any non-trivial locking changes in that area should go
separately from everything else and only with an accurate analysis of
those changes.  It's one of the easiest places to fuck up in.  Been
there, done that, and so had many other people.

I'm not saying that it wouldn't benefit from cleaner locking - it sure
as hell would.  But it's in a really incestous relationship with a lot
of other pieces, both in fs/dcache.c and elsewhere.  Let's not mix that
into anything else - driveby cleanups in that place are very likely to
cause serious trouble.

More information about the Containers mailing list