[cgl_discussion] POSIX requirements question for TEMs/ISVs/OEMs - Robust Mutex Support

Pradeep Kathail pkathail at cisco.com
Tue Feb 18 16:08:06 PST 2003


Hi David,

We need such support for shared-memory mutexes. I am collecting feedback
from internal groups and will pass it back to you.

Brgds.
Pradeep

At 2/18/2003 03:02 PM -0800, Howell, David P wrote:

>Hello,
>
>I am working to provide a definition for the robust mutexes feature that was
>added to NGPT. This was done to satisfy a CGL requirement from a member
>company. It provides new POSIX functionality for robust use of shared mutexes,
>using a POSIX framework.
>
>My aim with this posting is to get feedback from TEMs, OEMs, and ISVs on their
>need for this feature. If it is something needed by telecom-space applications,
>I will go forward and propose it to the Austin Group for inclusion in future
>POSIX specifications; otherwise we can drop it.
>
> 
>
>A quick sketch of the implementation:
>
> 
>
>Robust mutex extensions to POSIX
>--------------------------------
>
>Robust mutex support permits a mutex to synchronize threads between process
>groups effectively, even when processes exit or abort unexpectedly. This gives
>a POSIX counterpart to the Solaris robust mutex implementation.
>
> 
>
>Characteristics:
>
>- A robust mutex is initialized with the robust mutex attribute. It must be an
>  inter-process shared mutex, allocated in a shared memory segment mapped into
>  the processes that use it.
>- Each process that uses the robust mutex must call pthread_mutex_init at least
>  once to register the robust mutex with the system and initialize it. If
>  pthread_mutex_init is called on a previously initialized mutex, it will not
>  re-initialize the mutex.
>- When a process dies while holding the robust mutex, the mutex gets unlocked.
>- The next thread that attempts to lock the mutex will acquire it, but with a
>  return value from pthread_mutex_lock/pthread_mutex_trylock of EOWNERDEAD
>  instead of success. If the process that got the lock with EOWNERDEAD also
>  dies, the next locker will get the mutex with a return value of EOWNERDEAD
>  as well.
>- Applications using a robust mutex must always check the return code from
>  these APIs to see whether this occurred. If it did, they should attempt to
>  make the state protected by the mutex consistent, since that state could
>  have been left inconsistent when the last owner died.
>- If the new robust mutex owner is able to make the state consistent, it should
>  re-initialize the mutex and then unlock it. If it can't make the state
>  consistent, for whatever reason, it should not re-initialize the mutex, but
>  should just unlock it; all subsequent calls to lock the mutex will then fail
>  with the error return ENOTRECOVERABLE.
>- The robust mutex can be made consistent again by un-initializing the mutex
>  and reinitializing it.
>
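
The lock-and-recover protocol in the characteristics above can be sketched in C
roughly as follows. This is only an illustration, not NGPT's code. It assumes
the names that were later standardized in POSIX.1-2008
(pthread_mutexattr_setrobust, PTHREAD_MUTEX_ROBUST, pthread_mutex_consistent)
rather than the _np variants named in this post, and it marks the mutex
consistent with pthread_mutex_consistent instead of re-initializing it. The
struct shared_state, its counter field, and all helper function names are
hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_ANONYMOUS on glibc */
#endif
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared state: a robust mutex plus the data it protects. */
struct shared_state {
    pthread_mutex_t lock;
    int counter;
};

/* Map the state into memory that stays shared across fork() and initialize
 * the mutex as process-shared and robust. Returns NULL on failure. */
static struct shared_state *create_shared_state(void)
{
    struct shared_state *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return NULL;

    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc == 0) {
        rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        if (rc == 0)
            rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
        if (rc == 0)
            rc = pthread_mutex_init(&s->lock, &attr);
        pthread_mutexattr_destroy(&attr);
    }
    if (rc != 0) {
        munmap(s, sizeof *s);
        return NULL;
    }
    s->counter = 0;
    return s;
}

/* Lock the mutex. If the previous owner died, repair the protected state and
 * mark the mutex consistent; *recovered is set to 1 in that case.
 * Returns 0 with the lock held, or an error number. */
static int lock_and_recover(struct shared_state *s, int *recovered)
{
    int rc = pthread_mutex_lock(&s->lock);
    *recovered = 0;
    if (rc == EOWNERDEAD) {
        s->counter = 0; /* application-specific repair (assumed trivial here) */
        rc = pthread_mutex_consistent(&s->lock);
        *recovered = 1;
    }
    return rc;
}

/* A child locks the mutex and exits without unlocking it; the parent then
 * locks twice and records whether each lock needed recovery. */
static int demo_owner_death(int *first_recovered, int *second_recovered)
{
    *first_recovered = 0;
    *second_recovered = 0;
    struct shared_state *s = create_shared_state();
    if (s == NULL)
        return ENOMEM;

    pid_t pid = fork();
    if (pid < 0) {
        int err = errno;
        munmap(s, sizeof *s);
        return err;
    }
    if (pid == 0) {
        if (pthread_mutex_lock(&s->lock) == 0)
            s->counter = 42; /* pretend to die in the middle of an update */
        _exit(0);            /* exit while still holding the lock */
    }
    if (waitpid(pid, NULL, 0) < 0) {
        int err = errno;
        munmap(s, sizeof *s);
        return err;
    }

    int rc = lock_and_recover(s, first_recovered);
    if (rc == 0)
        rc = pthread_mutex_unlock(&s->lock);
    if (rc == 0)
        rc = lock_and_recover(s, second_recovered);
    if (rc == 0)
        rc = pthread_mutex_unlock(&s->lock);

    pthread_mutex_destroy(&s->lock);
    munmap(s, sizeof *s);
    return rc;
}
```

On a system that implements robust mutexes (Linux with glibc, for example),
calling demo_owner_death should report a recovery on the first lock and a
plain success on the second. If the recovering owner unlocked without calling
pthread_mutex_consistent, later lock attempts would be expected to fail with
ENOTRECOVERABLE, matching the behaviour described above.
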
> 
>
>NGPT implementation details (not sure how useful these are for requirements,
>but the implementation does exist):
>
>- POSIX APIs affected:
>      - pthread_mutexattr_setrobust_np - New API for setting the PTHREAD_MUTEXATTR_ROBUST attribute
>      - pthread_mutexattr_getrobust_np - New API for getting the robust attribute
>      - pthread_mutex_consistent_np - New API to make an inconsistent robust mutex consistent
>      - pthread_mutex_init - Handling for the robust attribute
>      - pthread_mutex_lock - Lock-time handling of the robust attribute
>      - pthread_mutex_trylock - Lock-time handling of the robust attribute (common with pthread_mutex_lock)
>      - pthread.h - Addition of PTHREAD_MUTEX_ROBUST_NP
>
> 
>
> 
>
>Some notes from earlier discussions:
>
>- The existing implementation that has a thread feature like this is in the
>  Solaris thread APIs (not POSIX). See the Solaris mutex_init
>  USYNC_PROCESS_ROBUST attribute for details of how it's done in that
>  implementation.
>- NGPT's implementation uses extensions to the existing POSIX thread APIs.
>- Without a POSIX standard initiative/tie-in, this likely isn't going anywhere
>  in other open source pthreads implementations (like NPTL).
>- This seems to be critical functionality for the use of shared mutexes: key
>  enough for Solaris to adopt and support it, and key enough to be a
>  requirement that was implemented for NGPT.
>- There is no other POSIX way to do this without accessing the internals of
>  the pthread_mutex_t structures and hacking other internal parts of several
>  POSIX thread APIs, i.e. to find out a shared mutex owner and test whether it
>  is alive or dead, to allow an application restart to regain access to
>  existing shared mutexes, and to check and correct for shared mutex
>  consistency.
>- It does provide a valuable service for hardened/restartable applications
>  that use shared mutexes to synchronize between process groups.
>- It would be a while (Linux 2.6, best case) before it would be in a draft
>  of the POSIX specifications that gets ratified. So, this is beyond the
>  CGL 2.0 timeframe.
>
> 
>
>It'd be really nice if this could be passed down to architects/engineers who
>are familiar with the applications that are provided, so that they can comment
>on it. Any help is very much appreciated.
>
> 
>
>Thanks,
>Dave Howell
>
>These are my opinions and not official opinions of Intel Corp.
>
>David Howell
>Intel Corporation
>Telco Server Development
>Server Products Division
>Voice: (803) 461-6112  Fax: (803) 461-6292
>
>Intel Corporation
>Columbia Design Center, CBA-2
>250 Berryhill Road, Suite 100
>Columbia, SC 29210
>
>david.p.howell at intel.com
>
> 



