[cgl_discussion] Proposal for the implementation of Robust Mutexes (ala Sun's POSIX extension)

Joe DiMartino joe at osdl.org
Fri Mar 14 13:54:37 PST 2003

On Fri, 2003-03-14 at 11:08, Perez-Gonzalez, Inaky wrote:
> > -----Original Message-----
> > From: Joe DiMartino [mailto:joe at osdl.org]
> > 
> > I like the idea to have the mutex remain in EOWNERDEAD state until all
> > have a chance to fix it, however there is a slight snag.
> > 
> > ....
> > 
> > Here are the snags: First, there is no pthread_mutex_inconsistent_np()
> > which will set the state to ENOTRECOVERABLE.  Even if there were such
> That's easy to fix: we add it. We are talking about non-standard compliant
> stuff here, so we are free to add whatever we see fit. I guess it'd be
> interesting to propose this to POSIX after we have an agreed-upon solution,
> though.
> > a call, how would any of the surviving possible owners know that all
> > other such owners have had a go at fixing it?  Imagine a busted mutex
> > with 3 queued requests.  The first gets ownership, can't fix it and
> > lets go (still EOWNERDEAD).  What does it do next - re-queue?  It most
> > likely needs this mutex to complete whatever it's working on.  Whether
> > it re-queues or not, the remaining two queued survivors eventually get
> > their turn to fix it, and if they can't, the final one still doesn't
> > know that everyone else has had a go.  So this mutex will remain forever
> > in the EOWNERDEAD state.
> Sure that's a problem, but I think it is up to the application(s) to
> implement policy to go around it.
> The way I would solve it at the application level would be splitting
> programs into A) cannot fix consistency problems, B) can fix consistency
> problems. I would make it so that only 1 program is in group B.

What would be the interface to choose being in group A or B?

When the scope of this problem was threads within a single process
(i.e., pthreads), it was hard to imagine that the "fix it" code wasn't
available to every thread. If threads are blocked waiting to use the
resource, the knowledge of how to set it back to a valid, working state
must exist somewhere.

Now that the scope is being expanded, you have to assume there are
communication mechanisms between processes that are external to the
mutex API.  If you already have to assume that group A (can't fix alone)
has a way to contact group B (knows how to fix), couldn't this be done
without changing the semantics at all:

	group A gets lock EOWNERDEAD, contacts group B with a "fix it"
	request (the locks are only advisory anyway, so group B can
	still update what it needs to), and based on B's reply, on
	success A calls pthread_mutex_consistent_np() and on failure
	releases the mutex with implicit ENOTRECOVERABLE.
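
A minimal sketch of that group A flow, under some loud assumptions: it
uses the names later standardized in POSIX.1-2008
(pthread_mutexattr_setrobust(), pthread_mutex_consistent()) rather than
Sun's _np names, and the fix_request_fn callback is a hypothetical
stand-in for whatever external mechanism A uses to reach B. The helper
simulate_owner_death() and the two stub fixers exist only to exercise
the code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hook: ask the group B "fixer" to repair the shared data.
 * Returns true if B reports that the data is consistent again. */
typedef bool (*fix_request_fn)(void *shared_data);

/* Initialise a mutex with the robust attribute set. */
int robust_mutex_init(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Group A lock. Returns 0 with the mutex held, or an error code
 * (ENOTRECOVERABLE included) with the mutex not held. */
int group_a_lock(pthread_mutex_t *m, fix_request_fn ask_group_b, void *data)
{
    int rc = pthread_mutex_lock(m);
    if (rc != EOWNERDEAD)
        return rc;              /* 0, ENOTRECOVERABLE, or another error */

    /* We hold the mutex, but the previous owner died while holding it. */
    if (ask_group_b(data) && pthread_mutex_consistent(m) == 0)
        return 0;               /* B fixed the data; mutex is usable again */

    /* Unlocking without marking the mutex consistent leaves it
     * permanently ENOTRECOVERABLE (the "implicit" case). */
    pthread_mutex_unlock(m);
    return ENOTRECOVERABLE;
}

/* Demo helpers only: a thread that exits while holding the lock,
 * and two stub fixers standing in for group B. */
static void *die_holding_lock(void *arg)
{
    pthread_mutex_lock((pthread_mutex_t *)arg);
    return NULL;                /* exits without unlocking */
}

void simulate_owner_death(pthread_mutex_t *m)
{
    pthread_t t;
    int rc = pthread_create(&t, NULL, die_holding_lock, m);
    assert(rc == 0);
    pthread_join(t, NULL);
}

bool fixer_always_succeeds(void *data) { (void)data; return true; }
bool fixer_always_fails(void *data)    { (void)data; return false; }
```

With fixer_always_succeeds the caller comes back holding a healthy
mutex; with fixer_always_fails every later pthread_mutex_lock() on that
mutex should return ENOTRECOVERABLE.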

> Whenever any program in group A finds a consistency problem, it signals
> program B and retries the lock for a maximum amount of time/tries, maybe
> waiting for a broadcast from B to actually do it. Then if/when it retries
> and gets ENOTRECOVERABLE, bang, it bails out. On EOWNERDEAD, it waits and
> tries again until the timeout hits. If normal, it keeps going ...
> Solution B, more in the line of your suggested scenario, is that a
> fixer-program tries to lock, EOWNERDEAD, tries to fix, fails and passes it
> on. Then it retries, if still EOWNERDEAD, kill it and bail out.

Yes, in this case, the fixer can assume EOWNERDEAD without actually
acquiring the mutex - sort of mutex by proxy from the non-fixer that
currently holds the mutex.
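
For comparison, the retry policy quoted above for a group A program
could look roughly like the sketch below. It assumes the proposed
semantics, where the mutex stays EOWNERDEAD after an unlock, so the lock
is hidden behind a hypothetical lock_ops table instead of calling
pthreads directly. The fake_mutex type and its functions are invented
purely to make the loop runnable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operations on a mutex with the proposed semantics. */
struct lock_ops {
    int  (*lock)(void *mutex);    /* 0, EOWNERDEAD, ENOTRECOVERABLE, ... */
    void (*unlock)(void *mutex);  /* state stays EOWNERDEAD if unfixed */
    void (*signal_fixer_and_wait)(void *mutex); /* poke B, wait a while */
};

/* Group A: cannot fix the data itself, so it retries up to max_tries.
 * Returns 0 with the mutex held, ETIMEDOUT if the fixer never repaired
 * it, or the error reported by lock() (e.g. ENOTRECOVERABLE). */
int group_a_lock_with_retry(const struct lock_ops *ops, void *mutex,
                            int max_tries)
{
    assert(max_tries > 0);
    for (int attempt = 0; attempt < max_tries; attempt++) {
        int rc = ops->lock(mutex);
        if (rc == 0)
            return 0;             /* normal: keep going */
        if (rc != EOWNERDEAD)
            return rc;            /* ENOTRECOVERABLE etc.: bail out */

        /* EOWNERDEAD: we hold the lock but cannot fix the data.
         * Let go, signal the fixer, and wait before retrying. */
        ops->unlock(mutex);
        ops->signal_fixer_and_wait(mutex);
    }
    return ETIMEDOUT;
}

/* Demo stand-in: 'state' is what lock() reports; the simulated fixer
 * repairs the mutex after 'signals_until_fixed' signals (0 = never). */
struct fake_mutex { int state; int signals_until_fixed; };

int fake_lock(void *m) { return ((struct fake_mutex *)m)->state; }
void fake_unlock(void *m) { (void)m; }
void fake_signal(void *m)
{
    struct fake_mutex *fm = m;
    if (fm->signals_until_fixed > 0 && --fm->signals_until_fixed == 0)
        fm->state = 0;
}

const struct lock_ops fake_ops = { fake_lock, fake_unlock, fake_signal };
```

The loop only encodes policy; how B is signalled and how long A waits
between tries are left to the application, which is the point being
argued in this thread.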

> My point with this system is not that three guys can both try to fix it
> (well, kind of is that too). My point is it gives you flexibility to
> implement any type of solution, be it Sun's or a more elaborate one without
> being limited.

Flexibility is good. :-)  But a solution still exists without changing
the semantics.

> Do you agree with this?

So, there are 2 choices: assume the application will handle it outside
the context of this API and use standard semantics, or extend the
semantics _and_ the API to include some recovery helpers.

-- Joe DiMartino <joe at osdl.org>
