[Openais] Re: FW: Evt Deadlock

Steven Dake sdake at mvista.com
Tue Jan 24 14:13:18 PST 2006


On Tue, 2006-01-24 at 15:42 -0600, Muni Bajpai wrote:
> So I think the basic premise of what I saw was that in the HandlePut
> call, the thread in question was holding the channel_handle_db lock and
> requesting the event_handle_db lock. So for deadlock to happen, another
> thread has to hold at least the event_handle_db lock while requesting
> the channel_handle_db lock.
> 
> I'm not sure if any path in make_event can fulfill the above criteria
> (0.70 code) to cause a deadlock, but then again I might be missing
> something.
> 
> Thanks
> 

Muni

The mutexes are not held for long periods in the library.  Instead we
use reference counting to avoid the need to hold these locks for long
periods and provide better concurrency in multi-threaded apps.

The scenario you describe cannot happen.  The call code is not:
request channel handle db lock
request event handle db lock
do some operation on the event data
release event handle db lock
release channel handle db lock

Instead, with the reference counting code, it is always:
request channel handle db lock
increase ref count on channel handle db 
release channel handle db lock
decrease ref count on channel handle db
request event handle db lock
increase ref count on event handle db 
release event handle db lock
decrease ref count on event handle db
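
A minimal sketch of the pattern above, for illustration only. The names
(handle_db, handle_get, handle_put, touch_event) and the fixed-size table
are assumptions made up for this example and are not the library's actual
code. The point it shows is that each database mutex is held only while a
reference count changes, so no thread ever holds one database lock while
requesting the other.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DB_SIZE 8   /* hypothetical fixed table size */

struct entry {
    int in_use;
    int ref_count;
    void *instance;
};

struct handle_db {
    pthread_mutex_t mutex;
    struct entry entries[DB_SIZE];
};

/* Take a reference; the mutex is held only while the count changes. */
int handle_get(struct handle_db *db, unsigned int handle, void **instance)
{
    int res = -1;

    pthread_mutex_lock(&db->mutex);
    if (handle < DB_SIZE && db->entries[handle].in_use) {
        db->entries[handle].ref_count++;
        *instance = db->entries[handle].instance;
        res = 0;
    }
    pthread_mutex_unlock(&db->mutex);
    return res;
}

/* Drop a reference; again the mutex is released before returning. */
int handle_put(struct handle_db *db, unsigned int handle)
{
    int res = -1;

    pthread_mutex_lock(&db->mutex);
    if (handle < DB_SIZE && db->entries[handle].in_use) {
        assert(db->entries[handle].ref_count > 0);
        db->entries[handle].ref_count--;
        res = 0;
    }
    pthread_mutex_unlock(&db->mutex);
    return res;
}

/* Hypothetical caller: the channel db lock is already released
 * by the time the event db lock is requested. */
int touch_event(struct handle_db *channels, unsigned int ch,
                struct handle_db *events, unsigned int ev)
{
    void *chan;
    void *evt;

    if (handle_get(channels, ch, &chan) != 0) {
        return -1;
    }
    handle_put(channels, ch);

    if (handle_get(events, ev, &evt) != 0) {
        return -1;
    }
    /* ... operate on evt here with no database mutex held ... */
    (void)chan;
    (void)evt;
    handle_put(events, ev);
    return 0;
}
```

Because the two mutexes are never nested, the order in which different
threads reach them does not matter for deadlock purposes.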

I think it is more likely that there is a bug (like the one Mark fixed)
with saHandleDestroy being called on a handle database with an invalid
handle.  This is also entirely consistent with the previous segfault we
saw, where the handle was improperly passed.
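
As a rough illustration of that failure mode, here is a sketch of a
destroy routine that refuses a handle which is not live in the database
it is given. Everything in it (handle_db, handle_destroy, the state enum,
the table size) is hypothetical and is not the saHandleDestroy
implementation; locking is omitted to keep it short.

```c
#include <assert.h>
#include <stddef.h>

#define DB_SIZE 4   /* hypothetical fixed table size */

enum handle_state {
    HANDLE_EMPTY = 0,
    HANDLE_ACTIVE,
    HANDLE_PENDING_REMOVAL
};

struct handle_db {
    enum handle_state state[DB_SIZE];
};

/* Returns 0 on success, -1 if the handle is not a live entry in this
 * database (for example, because it belongs to a different database). */
int handle_destroy(struct handle_db *db, unsigned int handle)
{
    assert(db != NULL);

    if (handle >= DB_SIZE || db->state[handle] != HANDLE_ACTIVE) {
        return -1;  /* out of range, stale, or wrong database: refuse */
    }
    db->state[handle] = HANDLE_PENDING_REMOVAL;
    return 0;
}
```

With a check like this, passing an event handle to the channel database
yields an error return rather than touching an unrelated or unallocated
entry.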

I suspect Mark's fix will resolve this problem, as would upgrading to a
version that includes the fix for defect 1029.

Regards
-steve

> Muni
> 
> -----Original Message-----
> From: Mark Haverkamp [mailto:markh at osdl.org] 
> Sent: Tuesday, January 24, 2006 3:17 PM
> To: sdake at mvista.com
> Cc: Bajpai, Muni [RICH1:B670:EXCH]; openais at lists.osdl.org;
> scd at broked.org
> Subject: Re: [Openais] Re: FW: Evt Deadlock
> 
> On Tue, 2006-01-24 at 12:27 -0700, Steven Dake wrote:
> [ ... ]
> 
> > 
> > Mark, I'd take a second look at your saHandleDestroy calls as they may
> > have some kind of problem.
> 
> OK, I found a bad bug.  I don't know if it is related to anyone's
> trouble, but it is bad.  In make_event (creates an event structure in
> the library code) the new event is destroyed if there are any errors.  I
> was using the wrong handle database to destroy the event handle.  Here
> is the patch.  This will need to be checked into the picacho branch too.
> I'll create a bugzilla entry too.
> 
> Mark.
> 
> 



