[cgl_discussion] Proposal for the implementation of Robust Mutexes (ala Sun's POSIX extension)

Perez-Gonzalez, Inaky inaky.perez-gonzalez at intel.com
Fri Mar 14 15:58:25 PST 2003

> > The ones in group A are the ones that are "dumb" so to speak - they
> > just see EOWNERDEAD, cry for help, wait and if they see
> > ENOTRECOVERABLE, they bail out. The ones in group B try to fix, and
> > call either pthread_mutex_{consistent,not_recoverable}().
> What is the interface to "cry for help"?  Should it be an extension of
> this API?  If so, how do processes/threads register as "I can fix it"?

That's up to the application - we don't need to go there, as there are so
many combinations that enumerating them would give a computer a headache
... [interesting irony, btw :)]
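
For illustration, here is a minimal sketch of what a group B "fixer" could
look like. It is written against the robust mutex calls that were later
standardized in POSIX.1-2008 (pthread_mutexattr_setrobust() and
pthread_mutex_consistent()), not against the library under discussion in
this thread, so the names may differ from the proposal. POSIX has no
pthread_mutex_not_recoverable(); there, unlocking without marking the mutex
consistent has that effect. The struct, the helper names and the "repair"
step are hypothetical, and the owner death is simulated by a thread that
exits while holding the lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical shared state guarded by a robust mutex. */
struct shared {
    pthread_mutex_t lock;
    int value;              /* application data; -1 means "half updated" */
};

/* Initialise the mutex with the robust attribute. Returns 0 on success. */
static int shared_init(struct shared *s)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(s != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0)
        rc = pthread_mutex_init(&s->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    s->value = 0;
    return rc;
}

/* Simulated owner death: take the lock, damage the data, exit without
 * unlocking. */
static void *die_holding_lock(void *arg)
{
    struct shared *s = arg;

    pthread_mutex_lock(&s->lock);
    s->value = -1;
    return NULL;
}

/* Group B behaviour: lock, and if the previous owner died, repair the data
 * and mark the mutex consistent. On return value 0 the caller holds the
 * lock; *recovered is set to 1 if a repair was performed. */
static int fixer_lock(struct shared *s, int *recovered)
{
    int rc = pthread_mutex_lock(&s->lock);

    *recovered = 0;
    if (rc != EOWNERDEAD)
        return rc;          /* 0, ENOTRECOVERABLE, or another error */

    s->value = 0;           /* application-specific repair (assumed) */
    rc = pthread_mutex_consistent(&s->lock);
    if (rc != 0) {
        pthread_mutex_unlock(&s->lock);
        return rc;
    }
    *recovered = 1;
    return 0;
}
```

How group A tasks "cry for help" and how the lock is passed through to the
fixer is left out on purpose, since that part is application policy.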

> > I'd do group A sees EOWNERDEAD, kills some recovery thread/process R
> > with some signal number who immediately claims the lock; everybody
> > passes it through until R gets it. R does the thing and unlocks it.
> > Now the group A tasks are reclaiming every X time until they either
> > see LOCK or NOTRECOVERABLE, for example, or they wait for a broadcast
> > (as you mentioned) or who knows.
> What if R is the one that died?

The designer would need to take that case into account and, should it
happen, have a watchdog restart R if it is the only "fixer" available. If
it fails too often ... well, there is a bigger problem to solve now, much
more urgent than who's able to recover the mutex :)
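
As a rough sketch of the watchdog idea, assuming R runs as a child process
and that "fails too much" means a fixed restart limit: the parent forks R,
waits for it, and restarts it if it dies or reports failure. The names
watchdog_run, fixer_fn, flaky_fixer and hopeless_fixer are hypothetical;
nothing here is part of the proposed API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical fixer entry point: returns 0 once recovery is done.
 * 'attempt' is 0 for the first run, 1 for the first restart, and so on. */
typedef int (*fixer_fn)(int attempt);

/* Run the fixer in a child process and restart it when it dies or fails.
 * Returns the number of restarts that were needed, or -1 if the fixer
 * still had not succeeded after max_restarts restarts (or fork failed). */
static int watchdog_run(fixer_fn fixer, int max_restarts)
{
    assert(fixer != NULL);

    for (int attempt = 0; attempt <= max_restarts; attempt++) {
        int status;
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0)                       /* child: this is R */
            _exit(fixer(attempt) == 0 ? 0 : 1);

        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR)
                return -1;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return attempt;                 /* R finished its job */
        /* R was killed or reported failure: loop around and restart it. */
    }
    return -1;  /* the "bigger problem to solve" case */
}

/* Example fixer that is killed on its first two runs, then succeeds. */
static int flaky_fixer(int attempt)
{
    if (attempt < 2)
        raise(SIGKILL);
    return 0;
}

/* Example fixer that never manages to recover. */
static int hopeless_fixer(int attempt)
{
    (void)attempt;
    return 1;
}
```

With these example fixers, watchdog_run(flaky_fixer, 5) needs two restarts,
while watchdog_run(hopeless_fixer, 3) gives up and returns -1.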

> Maybe there can be some maximum number of EOWNERDEAD returns before the
> mutex is automatically converted to NOTRECOVERABLE?

Sure, that is another way, and that can be done by the application following
its own policy as it best sees fit.
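
One way an application might implement such a cap itself is sketched below,
again assuming the POSIX.1-2008 robust mutex calls rather than the library
discussed here. The wrapper counts EOWNERDEAD returns and, past a limit,
unlocks without calling pthread_mutex_consistent(), which under POSIX makes
the mutex permanently return ENOTRECOVERABLE. MAX_OWNER_DEATHS,
struct capped_mutex and the helper names are hypothetical, and the owner
deaths are simulated by threads that exit while holding the lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

#define MAX_OWNER_DEATHS 2      /* hypothetical policy limit */

struct capped_mutex {
    pthread_mutex_t lock;
    int owner_deaths;           /* guarded by lock */
};

/* Initialise a robust mutex with a zeroed death counter. */
static int capped_init(struct capped_mutex *m)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0)
        rc = pthread_mutex_init(&m->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    m->owner_deaths = 0;
    return rc;
}

/* Lock with the cap applied. Returns 0 with the lock held, or
 * ENOTRECOVERABLE once more than MAX_OWNER_DEATHS owners have died. */
static int capped_lock(struct capped_mutex *m)
{
    int rc = pthread_mutex_lock(&m->lock);

    if (rc != EOWNERDEAD)
        return rc;

    if (++m->owner_deaths > MAX_OWNER_DEATHS) {
        /* Give up: unlocking an inconsistent robust mutex leaves it
         * permanently not recoverable. */
        pthread_mutex_unlock(&m->lock);
        return ENOTRECOVERABLE;
    }

    /* A real application would repair its protected data here. */
    rc = pthread_mutex_consistent(&m->lock);
    if (rc != 0)
        pthread_mutex_unlock(&m->lock);
    return rc;
}

/* Test helper: a thread that takes the lock and exits without unlocking. */
static void *lock_and_exit(void *arg)
{
    struct capped_mutex *m = arg;

    pthread_mutex_lock(&m->lock);
    return NULL;
}

/* Test helper: run lock_and_exit to completion. Returns 0 on success. */
static int simulate_owner_death(struct capped_mutex *m)
{
    pthread_t t;
    int rc = pthread_create(&t, NULL, lock_and_exit, m);

    return rc != 0 ? rc : pthread_join(t, NULL);
}
```

Keeping the counter inside the structure protected by the mutex means it is
only touched while the lock is held, so it needs no extra synchronization.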

> > > > Do you agree with this?
> > >
> > > So, there are 2 choices: assume the application will handle it outside
> > > the context of this API and use standard semantics, or extend the
> > > semantics _and_ the API to include some recovery helpers.
> >
> > That's it. No big deal. If you want Sun's traditional (not a standard,
> > though, de-facto standard, maybe), tell so to the library (or don't say
> > it, depending on what's the default). You want the other thing, welcome,
> > here you are.
> That's a great compromise. :-)  Have it default to the Sun traditional
> way unless some library option is specified to get the extensions.

OK, everybody's happy then ... I have been modifying the kernel code to do
this two-state thingie; it wasn't as painful as I feared - time to write a
few tests while the kernel compiles :)

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)
