[cgl_discussion] RE: [PATCH] Robust and Real-time Mutexes in NPTL

Perez-Gonzalez, Inaky inaky.perez-gonzalez at intel.com
Mon Jul 14 12:29:05 PDT 2003

Hi Ulrich

> From: Ulrich Drepper [mailto:drepper at redhat.com]
> > This is a ---prototype implementation--- that extends NPTL with
> > real-time and robust mutex features.
> I looked over the code now and can make a few comments.
> This functionality clearly is useful outside the RT arena.  Therefore
> the kernel side should not be mixed with the rtfutex code.  You
> definitely should split this functionality out, make the robustness
> patch a first, smaller patch.

This is kind of difficult, as the key thing the robust and real-time
parts share is the owner-identification property. However, I agree
with you--this is more a design failure on my side than anything else.

The design failure is trying to add the concept of ownership to
something that does not have it or allow it (futexes). The more I look
into it, the more convinced I am of it.

Also, many other aspects of the real-time futexes are needed for
robustness to succeed. For example, the kernel, rather than user
space, setting the value of the futex when passing ownership.

> As for the RT futex stuff.  It looks awfully complicated.  Not that the
> topic isn't complicated but I would be surprised if you would get the
> code past Linus.

Agreed--that is almost the ugliest code I ever wrote, but that is OK
because the code is far from ready for integration. If nothing
else, it has given many indications and clues about how to
really do it, so the next design should solve many of those issues.

>  The integration of the RT handling in nptl itself is
> not acceptable for the official version.  You can have a separate
> libpthread.so binary with the RT behavior but that's it.  The question
> is how will this work with inter-process sync objects where one side
> uses RT, the other non-RT.  It is possible to declare this a user

I have found a better way to do it; the objects are all the same and
the only thing that changes is how you perform the unlock operation:
serialized (the kernel sets the next owner) or unserialized (the
unlocker sets the mutex to unlocked and wakes up one waiter).

It is feasible; the only drawback is that robustness is half as
robust in unserialized mode. The few evaluations I have been
able to make with test cases give no clear winner on which is more
effective (serialized vs. unserialized), but on synthetic
benchmarks, unserialized wins.

Of course, by having a bit toggle in the shared data structure, it 
can be easily determined what unlock method to use by default,
so that should not be a problem even on shared objects.

> problem but it can easily happen.  For mutexes it might be possible to
> integrate the changes in libpthread itself.  Not the way you've done it,
> but instead by using yet another kind of mutex.  This way regular users
> are not disturbed and punished by the RT code.  For rwlocks I don't see

The only reduction in performance I have seen by using RT locks has
been in synthetic benchmarks (I admit I haven't done extensive 
testing), but it was kind of expected because of the way the
scheduling was done; example:

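#include <pthread.h>

#define NTHREADS 4   /* assumed; the original left the array unsized */
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
volatile int progress[NTHREADS];

void *thread_fn (void *_index)
{
  int index = (int) (long) _index;   /* thread number passed as the argument */
  while (1) {
    pthread_mutex_lock (&mutex);
    progress[index]++;               /* presumably what the loop counted */
    pthread_mutex_unlock (&mutex);
  }
}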

When you fire up a few of these at the same time, it goes way
faster in NPTL than in NPTL with RT extensions--the main reason
is the way the unlock is done (unserialized vs. serialized).
On NPTL, the current mutex owner resets the futex to unlocked,
wakes up one waiter and then goes back to lock again; as it is
cache hot, it reacquires the futex right away through
the fast path. The woken-up thread bounces back into the kernel.

This keeps on until the owner's timeslice expires, and then the next
thread does the same.

OTOH, NPTL+RT will enforce strict serialization of access to
the mutex--a context switch per unlock with waiters on the 
mutex; it shatters the cache, of course, and progress
is way slower.

But as I said, this benchmark is so synthetic that it makes
little sense. Real-life code doesn't behave like this
(AFAIK), so that's why I am not worrying too much about unserialized
wake ups.

> this happening.  Maybe defining a completely separate
> pthread_rt_rwlock_t type and set of functions?  Or not handling RT for
> rwlocks at all.  POSIX doesn't say anything about such beasts anyway.

Some people are interested in at least robust behavior in rwlocks,
and then, RT stuff--let's say they are the same people who are
interested in the robust mutexes. There is a lot to experiment with here.

> condvars are a special case.  Only the pthread_cond_signal code would
> need handling.  Maybe some more kernel support is needed, too.  What

Not really - the RT handling is simple. The problem here is how to
tie robustness correctly without making the broadcast code even 

> And of course Intel's pet, sched_yield.  The implementation in the patch
> is plainly wrong.  sched_yield yields the processor, for the _entire_
> process, not just the current thread.  Get over it and fix your code.

Says who? Unless I got something in the code wrong (and if so, feel
free to bug me about it), __sched_yield() forces a task structure
to turn the CPU over, not a whole process.

It is similar, but not the same.

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own (and my fault)
