[Openais] Re: defect 169 fixed up (and 172)

Mark Haverkamp markh at osdl.org
Wed Oct 27 07:37:29 PDT 2004


On Tue, 2004-10-26 at 23:58 -0700, Steven Dake wrote:
> Mark,
> 
> I have a patch for defect 169 (assert on ifdown).  If the interface is
> downed during operation, the processor will enter a singleton
> configuration and continue to operate.  If the interface is then brought
> up later, the processor will attempt to join any other configurations it
> can locate on the multicast address.  In the process I found a pretty
> nasty bug (defect 172) which causes two singleton configurations not to
> be able to form a configuration because the local variables that are
> normally reset by the evs algorithm are not changed in the singleton
> configuration mode.
> 
> If you could give it a spin and let me know how it works for you, that'd
> be cool.

OK, I tried it.  The other nodes recovered OK.  The node that I did the
ifdown on didn't though.  I had all the nodes sending and receiving
events at the time.

Here is where I did the ifdown:

Oct 27  7:27:39 [WARNING ] [GMI  ] Token loss in OPERATIONAL.
log_syslog: lost message: Resource temporarily unavailable
Oct 27  7:27:39 [NOTICE  ] [GMI  ] entering GATHER state.
Oct 27  7:27:39 [NOTICE  ] [GMI  ] SENDING attempt join because this node is ring rep.
log_syslog: lost message: Resource temporarily unavailable
Oct 27  7:27:39 [NOTICE  ] [GMI  ] I am the only member.
log_syslog: lost message: Resource temporarily unavailable
Oct 27  7:27:39 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
Oct 27  7:27:39 [NOTICE  ] [CLM  ] New Configuration:
Oct 27  7:27:39 [NOTICE  ] [CLM  ]      192.168.1.18
log_syslog: lost message: Resource temporarily unavailable
Oct 27  7:27:39 [NOTICE  ] [CLM  ] Members Left:
Oct 27  7:27:39 [NOTICE  ] [CLM  ]      192.168.1.8
Oct 27  7:27:39 [NOTICE  ] [CLM  ]      192.168.1.17
log_syslog: lost message: Resource temporarily unavailable
Oct 27  7:27:39 [NOTICE  ] [CLM  ]      192.168.1.19
Oct 27  7:27:39 [NOTICE  ] [CLM  ] Members Joined:
Oct 27  7:27:39 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
Oct 27  7:27:39 [NOTICE  ] [CLM  ] New Configuration:
Oct 27  7:27:39 [NOTICE  ] [CLM  ]      192.168.1.18
Oct 27  7:27:39 [NOTICE  ] [CLM  ] Members Left:
Oct 27  7:27:39 [NOTICE  ] [CLM  ] Members Joined:

That's all. I interrupted the program and did a stack dump.  It was in
poll_run.

Program received signal SIGINT, Interrupt.
0x420d224b in poll () from /lib/i686/libc.so.6
(gdb) bt
#0  0x420d224b in poll () from /lib/i686/libc.so.6
#1  0x08057fc0 in poll_run (handle=0) at aispoll.c:371
#2  0x0804a736 in main (argc=1, argv=0xbffffa24) at main.c:989
#3  0x420158d4 in __libc_start_main () from /lib/i686/libc.so.6
(gdb) c
Continuing.
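
For anyone reading the backtrace: a process sitting in poll() under
poll_run is consistent with an event loop that is idle and waiting for
descriptors to become ready, rather than one that crashed or is spinning.
A minimal sketch of that kind of wait is below. The helper name
wait_readable and its timeout parameter are hypothetical and are not
taken from aispoll.c.

```c
#include <poll.h>

/*
 * Hypothetical helper (not from aispoll.c): block until fd is readable
 * or timeout_ms elapses.  Returns the number of ready descriptors,
 * 0 on timeout, or -1 on error, exactly as poll() does.
 */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;   /* only interested in incoming data */
    pfd.revents = 0;

    /* With a negative timeout this would block indefinitely, which is
     * what a debugger would show as a frame stuck inside poll(). */
    return poll(&pfd, 1, timeout_ms);
}
```

If nothing ever arrives on the watched descriptors (for example, no
multicast traffic after the interface comes back), a loop built on a call
like this simply stays in poll().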


(The "log_syslog: lost message: Resource temporarily unavailable"
messages come from some debug output I added to see why I was dropping
messages from syslog.)
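
"Resource temporarily unavailable" is the glibc text for EAGAIN, which is
what a write to a non-blocking descriptor returns when the receiver's
buffer is full.  A rough sketch of how a debug line like the one above
could be produced follows.  The function name try_log_write and the use
of a plain write() are assumptions for illustration, not the actual
openais logging code.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the openais implementation): attempt to write
 * one log record to a descriptor that may be non-blocking.
 * Returns 0 on success, -1 if the message was lost.
 */
int try_log_write(int fd, const char *msg)
{
    ssize_t n = write(fd, msg, strlen(msg));

    if (n < 0) {
        int saved = errno;  /* keep errno before stdio can change it */

        /* On a full non-blocking descriptor this reports EAGAIN. */
        fprintf(stderr, "log_syslog: lost message: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

Under that assumption, a burst of log messages during a configuration
change could fill the buffer faster than the log daemon drains it, and
each dropped record would show up as one of these lines.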


> 
> There may be some kind of bug with this because ckptbench freezes
> (meaning it lost some messages) sometimes during operation.
> 
> Thanks
> -steve
> 
-- 
Mark Haverkamp <markh at osdl.org>



