[cgl_discussion] POSIX requirements question for TEMs/ISVs/OEMs - Robust Mutex Support
Howell, David P
david.p.howell at intel.com
Tue Feb 18 15:02:53 PST 2003
I am working to provide a definition for the robust mutexes feature that
was added to NGPT. This was done to satisfy a CGL requirement from a
member company. It provides
new POSIX functionality for robust use of shared mutexes, using a POSIX
My aim with this posting is to get feedback from TEMs, OEMs, and ISVs on
the need for this feature. If it is something needed by telecom space
applications, I will go forward and propose it to the Austin Group for
inclusion in future POSIX specifications; otherwise we can drop it.
A quick sketch of the implementation:
Robust mutex extensions to POSIX
Robust mutex support permits a mutex to synchronize threads between
processes effectively, even when processes exit or abort unexpectedly.
This gives a POSIX counterpart to the Solaris robust mutexes
implementation.
- A robust mutex is initialized with the robust mutex attribute. It must
be an inter-process shared mutex, allocated in a shared memory segment
mapped into the processes that use it.
- Each process that uses the robust mutex must call pthread_mutex_init
at least once to
register the robust mutex with the system and initialize it. If
pthread_mutex_init is called
on a previously initialized mutex it will not re-initialize the mutex.
- When a process dies, if the robust mutex is held by the process, it
is released.
- The next thread that attempts to lock the mutex will acquire it, but
with a return value from pthread_mutex_lock/pthread_mutex_trylock of
EOWNERDEAD instead of success. If the process which got the lock with
EOWNERDEAD dies in turn, the next locker will get the mutex with an
error return value of EOWNERDEAD as well.
- Applications using a robust mutex must always check the return code
from these APIs to
see if this occurred, and if it did should attempt to make the state
protected by the mutex
consistent, since this state could have been left inconsistent when
the last owner died.
- If the new robust mutex owner is able to make the state consistent, it
should re-initialize the mutex and then unlock it. If it can't make the
state consistent, for whatever reason, it should not re-initialize the
mutex, but should just unlock it; all subsequent calls to lock the mutex
will then fail with the error return ENOTRECOVERABLE.
- The robust mutex can be made consistent again by un-initializing the
mutex and reinitializing it.
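The lifecycle above can be sketched roughly as follows. This is only an
illustration, not the NGPT code: it assumes the interface names that
were later standardized in POSIX.1-2008 (pthread_mutexattr_setrobust,
PTHREAD_MUTEX_ROBUST, pthread_mutex_consistent) rather than the _np
names, and it marks the mutex consistent with pthread_mutex_consistent
instead of re-initializing it. The struct and function names
(shared_region, region_create, kill_child_holding_lock,
lock_and_recover) and the "counter" state are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared region: the robust mutex plus the state it guards. */
struct shared_region {
    pthread_mutex_t lock;
    int counter;
};

/* Map a shared segment and initialize a process-shared robust mutex in it. */
static struct shared_region *region_create(void)
{
    pthread_mutexattr_t attr;
    struct shared_region *r = mmap(NULL, sizeof *r, PROT_READ | PROT_WRITE,
                                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (r == MAP_FAILED)
        return NULL;
    if (pthread_mutexattr_init(&attr) != 0)
        return NULL;
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    int rc = pthread_mutex_init(&r->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    r->counter = 0;
    return rc == 0 ? r : NULL;
}

/* Fork a child that takes the lock and exits without unlocking it. */
static int kill_child_holding_lock(struct shared_region *r)
{
    int status;
    pid_t pid;

    assert(r != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_mutex_lock(&r->lock);
        r->counter = -1;        /* simulate a half-finished update */
        _exit(0);               /* die while owning the mutex */
    }
    return waitpid(pid, &status, 0) == pid ? 0 : -1;
}

/*
 * Lock the mutex, always checking the return code. On EOWNERDEAD the
 * caller owns the lock but must repair the protected state first.
 * Returns 0 with the lock held, or an error code.
 */
static int lock_and_recover(struct shared_region *r, int *recovered)
{
    int rc;

    assert(r != NULL && recovered != NULL);
    *recovered = 0;
    rc = pthread_mutex_lock(&r->lock);
    if (rc == EOWNERDEAD) {
        r->counter = 0;                           /* repair the state */
        rc = pthread_mutex_consistent(&r->lock);  /* mark it usable again */
        *recovered = 1;
    }
    return rc;
}
```

A caller would create the region once, and every locker would go through
lock_and_recover (or an equivalent check) followed by
pthread_mutex_unlock, so that a dead owner is always noticed.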
NGPT Implementation details, not sure how useful for requirements, but
it does exist:
- POSIX APIs affected:
- pthread_mutexattr_setrobust_np - New API for setting the robust
attribute
- pthread_mutexattr_getrobust_np - New API for getting the robust
attribute
- pthread_mutex_consistent_np - New API to make an inconsistent
robust mutex consistent
- pthread_mutex_init - Handling for robust attribute
- pthread_mutex_lock - Lock time handling of robust attribute
- pthread_mutex_trylock - Lock time handling of robust attribute
(common with pthread_mutex_lock)
- pthread.h - Addition of PTHREAD_MUTEX_ROBUST_NP
Some notes from earlier discussions:
- The existing implementation that has a thread feature like this is the
Solaris thread APIs (not POSIX). See Solaris mutex_init,
Attribute for details of how it's done with this implementation.
- NGPT's implementation uses extensions to the existing POSIX thread
APIs.
- Without a POSIX standard initiative/tie-in this likely isn't going
to make it into other open source pthreads implementations (like NPTL).
- This seems to be critical functionality for the use of shared mutexes,
key enough for Solaris to adopt and support it and key enough for it to
be a requirement that was implemented for NGPT.
- There is no other POSIX way to do this without accessing the internals
of the pthread_mutex_t structures and hacking other internal parts of
the POSIX thread APIs, i.e. to find out a shared mutex owner to test if
it is alive or dead, to allow an application restart to regain access to
shared mutexes, and to check and correct for shared mutex consistency.
- It does provide a valuable service for hardened/restartable
applications that use shared mutexes to synchronize between process
groups.
- It would be a while (Linux 2.6, best case) before it would be in a
version of the POSIX specifications that gets ratified. So, this is
beyond CGL
It'd be really nice if this could be passed down to architects/engineers
who are familiar with the applications that are provided, so they can
comment on this. Any feedback would be very much appreciated.
These are my opinions and not official opinions of Intel Corp.
Telco Server Development
Server Products Division
Voice: (803) 461-6112 Fax: (803) 461-6292
Columbia Design Center, CBA-2
250 Berryhill Road, Suite 100
Columbia, SC 29210
david.p.howell at intel.com