[Openais] Re: ipc rewrite

Mark Haverkamp markh at osdl.org
Wed Apr 26 10:02:16 PDT 2006


On Tue, 2006-04-25 at 16:16 -0700, Steven Dake wrote:
> The events dropped might be because of priority inversion of the
> subscription and publish tests.  They should be set to sched-rr:1.  Look
> at evsbench.  Eventually this will be resolved in a later patch so that
> priorities are automatically determined.  Let me know what tests you are
> running to get the "lockup" and I'll see what is wrong with the ipc.

OK, I'll check this out.
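
For reference, here is a minimal sketch of what running a test at
sched-rr:1 could look like, assuming that means the SCHED_RR policy at
static priority 1.  It is written in Python rather than C just to keep
it short; set_round_robin is a hypothetical helper name, not something
in the openais tree or in evsbench.

```python
import os

def set_round_robin(priority=1):
    """Try to move the calling process to SCHED_RR at `priority`.

    Returns True on success and False when the kernel refuses the
    request (realtime policies normally need root or CAP_SYS_NICE).
    """
    lowest = os.sched_get_priority_min(os.SCHED_RR)
    highest = os.sched_get_priority_max(os.SCHED_RR)
    if not lowest <= priority <= highest:
        raise ValueError(f"priority must be in {lowest}..{highest}")
    try:
        # pid 0 means "the calling process"
        os.sched_setscheduler(0, os.SCHED_RR, os.sched_param(priority))
    except OSError:
        return False
    return True
```

The scheduling policy is kept across exec, so a small launcher could
call this and then exec the publish or subscription test with its usual
arguments.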

> 
> evsbench seems to work properly, which is the only way I tested this.
> 
> What was the test case for the double free?

Same tests in all instances.  I run these two programs on each of my
four nodes:

publish -t0 -x2000 -w2 -f2
subscription -q -q -f2

The publish program writes 2000 events with a zero retention time, waits
two seconds and starts again.  The -f2 says to wait two seconds if
aisexec goes away and then try to connect again.

The subscription program subscribes to the event channel that the
publish program is sending events to and prints a "." for every 1000
events received.
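
The publish loop described above can be sketched roughly as follows.
This is only an illustration of the behaviour behind -x2000 -w2 -f2,
not the actual test source: connect and the publish function it returns
are hypothetical stand-ins for the event service calls, and the rounds
and sleep parameters exist only so that the sketch terminates and can be
exercised without a running aisexec.

```python
import time

def run_publisher(connect, events=2000, wait=2.0, retry_wait=2.0,
                  rounds=1, sleep=time.sleep):
    """Publish `events` events per round, pausing `wait` seconds after each.

    `connect` is a hypothetical callable returning a `publish(seq)`
    function.  Either may raise ConnectionError when the daemon goes
    away; the loop then waits `retry_wait` seconds and reconnects.
    Returns the total number of events handed to `publish`.
    """
    sent = 0
    completed = 0
    publish = None
    while completed < rounds:
        try:
            if publish is None:
                publish = connect()
            for seq in range(events):
                publish(seq)
                sent += 1
            completed += 1
            sleep(wait)            # the -w2 pause between rounds
        except ConnectionError:
            publish = None         # force a reconnect on the next pass
            sleep(retry_wait)      # the -f2 pause before retrying
    return sent
```

An interrupted round restarts from the first event after the reconnect
in this sketch; the real test may behave differently.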


> 
> With the new code, it will be difficult to run aisexec within gdb
> because the ipc code will often call pthread_kill to interrupt the poll
> when the outbound kernel queue is full (this interrupts gdb too, sigh).
> I'd recommend ulimit -c unlimited to create core files and then use
> gdb ./aisexec corefile
> 
> You can use thread 1, thread 2, etc. to get to different threads and get
> backtraces.
> 
> I realize this adds extra complication for the developers but it should
> pay off in the end.
> 
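
A small sketch of that core-file workflow, assuming aisexec is started
from the same process that raises the limit (the limit is inherited by
children).  The helper names are hypothetical.  Unlike
"ulimit -c unlimited", this raises the soft limit only as far as the
hard limit, which an unprivileged user is always allowed to do.

```python
import resource

def allow_core_dumps():
    """Raise the soft core-size limit to the current hard limit.

    Returns the (soft, hard) pair in effect afterwards.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)

def gdb_backtrace_command(binary, corefile):
    """Build a gdb command line that prints a backtrace for every thread.

    -batch makes gdb exit after running the -ex command, and
    "thread apply all bt" replaces stepping through thread 1, 2, ...
    """
    return ["gdb", "-batch", "-ex", "thread apply all bt", binary, corefile]
```

The returned list can be passed to subprocess.run once a core file
exists, for example gdb_backtrace_command("./aisexec", "corefile").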
> Regards
> -steve
> On Tue, 2006-04-25 at 15:07 -0700, Mark Haverkamp wrote:
> > On Tue, 2006-04-25 at 13:45 -0700, Steven Dake wrote:
> > > Here is an IPC rewrite I've been working on for a while.  It offers
> > > a 20-40% performance improvement on my systems.  It will someday allow
> > > realtime support without the priority inversions that other SA Forum
> > > implementations suffer from.  Please try it out.  This should also fix the
> > > double free mess with the current code.
> > > 
> > > Mark H you will need to call hdb_create/hdb_destroy on your esi_hdb data
> > > structure somewhere.  I wasn't sure where it was appropriate to call it.
> > > Could you take a look and add that code?
> > 
> > I applied the patch and put the create/destroy where I thought that it
> > should go in the event code.  I put the create in the lib_init_fn and
> > the destroy in the lib_exit_fn.
> > 
> > I have attached a patch for the event changes.  
> > 
> > I am not having much luck with it though.  I get events dropped fairly
> > quickly in the subscription program and for some reason, I'm not seeing
> > them re-start again.  If I kill the subscription test and re-start it I
> > see events.  Oops, while I was typing, one of the aisexec programs got a
> > glibc double free corruption abort.
> > 
> > 
> > > 
> > > Thanks
> > > -steve
-- 
Mark Haverkamp <markh at osdl.org>



