[cgl_discussion] Proposal for the implementation of Robust Mutexes (a la Sun's POSIX extension)

Perez-Gonzalez, Inaky inaky.perez-gonzalez at intel.com
Fri Mar 14 14:43:23 PST 2003

> > > a call, how would any of the surviving possible owners know that all
> > > other such owners have had a go at fixing it?  Imagine a busted mutex
> > > with 3 queued requests.  The first gets ownership, can't fix it and
> > > lets go (still EOWNERDEAD).  What does it do next - re-queue?  It most
> > > likely needs this mutex to complete whatever it's working on.  Whether
> > > it re-queues or not, the remaining two queued survivors eventually get
> > > their turn to fix it, and if they can't, the final one still doesn't
> > > know that everyone else has had a go.  So this mutex will remain
> > > forever in the EOWNERDEAD state.
> >
> > Sure that's a problem, but I think it is up to the application(s) to
> > implement policy to go around it.
> >
> > The way I would solve it at the application level would be splitting
> > programs into A) cannot fix consistency problems, B) can fix consistency
> > problems. I would make it so that only 1 program is in group B.
> What would be the interface to choose being in group A or B?

Ok, this is for my proposed interface,
pthread_mutex_{consistent,not_recoverable}(). The ones in group A are the
"dumb" ones, so to speak - they just see EOWNERDEAD, cry for help, wait,
and if they see ENOTRECOVERABLE, they bail out. The ones in group B try to
fix the state and then call either pthread_mutex_consistent() or
pthread_mutex_not_recoverable().

> When the scope of this problem was threads within a single process
> (i.e., pthreads) it gets hard to imagine that the "fix it" code isn't
> available to any thread - especially if they are blocking waiting to
> use the resource, there must be knowledge somewhere to set it back to
> a valid, working state.

Sure, but that is the easy case - a single case that has an "easy" solution.

> Now that the scope is being expanded, you have to assume there are
> communication mechanisms between processes that are external to the
> mutex API.  If you already have to assume that group A (can't fix alone)
> has a way to contact group B (knows how to fix), couldn't this be done
> without changing the semantics at all:
> 	group A gets lock EOWNERDEAD, contacts group B with a "fix it"
> 	request (the locks are only advisory anyway, so group B can
> 	still update what it needs to), and based on B's reply, on
> 	success A calls pthread_mutex_consistent_np() and on failure
> 	releases the mutex with implicit ENOTRECOVERABLE.

That's one way to do it, but I would not use it: I don't want an external
communication mechanism possibly making things more complex when I already
have one (the mutex itself).

I'd have group A see EOWNERDEAD and signal some recovery thread/process R
with some signal number; R immediately tries to claim the lock, and everybody
passes it through until R gets it. R does the fix-up and unlocks it. Meanwhile
the group A tasks keep re-claiming every X time until they see either LOCK or
NOTRECOVERABLE, for example, or they wait for a broadcast (as you mentioned),
or whatever else fits.

Again, what I want is flexibility to do this or to use Sun's approach if you
desire it. 
> > Whenever any program in group A finds a consistency problem, it signals
> > program B and retries the lock for a maximum amount of time/tries, maybe
> > waiting for a broadcast from B before actually doing so. Then if/when it
> > retries and gets NOTRECOVERABLE, bang, bail out. On EOWNERDEAD, it
> > waits and tries again until the timeout hits. If normal, keep going ...
> >
> > Solution B, more in the line of your suggested scenario, is that a
> > fixer-program tries to lock, gets EOWNERDEAD, tries to fix, fails and
> > passes it on. Then it retries; if still EOWNERDEAD, kill it and bail out.
> Yes, in this case, the fixer can assume EOWNERDEAD without actually
> acquiring the mutex - sort of mutex by proxy from the non-fixer that
> currently holds the mutex.

That's another way to do it.

> > My point with this system is not that three guys can all try to fix it
> > (well, kind of is that too). My point is that it gives you the flexibility
> > to implement any type of solution, be it Sun's or a more elaborate one,
> > without being limited.
> Flexibility is good. :-)  but a solution still exists without changing
> the semantics.

That forces you to have some kind of fixed policy or behaviour, while it can
also be done more flexibly, transparently behaving the way you want or not.

Think about it this way: Sun's way needs two dead states, OWNERDEAD and
NOTRECOVERABLE, like it or not. My suggestion needs those same two states;
the only difference is that you control when to transition from OWNERDEAD to
NOTRECOVERABLE instead of being forced to ... hey, it comes for free!

> > Do you agree with this?
> So, there are 2 choices: assume the application will handle it outside
> the context of this API and use standard semantics, or extend the
> semantics _and_ the API to include some recovery helpers.

That's it. No big deal. If you want Sun's traditional behaviour (not a
standard, though perhaps a de-facto one), tell the library so (or don't,
depending on what the default is). If you want the other thing, welcome,
here you are.

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)
