[Openais] Re: defect 169 fixed up (and 172)
Mark Haverkamp
markh at osdl.org
Wed Oct 27 07:37:29 PDT 2004
On Tue, 2004-10-26 at 23:58 -0700, Steven Dake wrote:
> Mark,
>
> I have a patch for defect 169 (assert on ifdown). If the interface is
> downed during operation, the processor will enter a singleton
> configuration and continue to operate. If the interface is then brought
> up later, the processor will attempt to join any other configurations it
> can locate on the multicast address. In the process I found a pretty
> nasty bug (defect 172) which causes two singleton configurations not to
> be able to form a configuration because the local variables that are
> normally reset by the evs algorithm are not changed in the singleton
> configuration mode.
>
> If you could give it a spin and let me know how it works for you, that'd
> be cool.
OK, I tried it. The other nodes recovered OK. The node that I did the
ifdown on didn't recover, though. I had all the nodes sending and
receiving events at the time.
Here is where I did the ifdown:
Oct 27 7:27:39 [WARNING ] [GMI ] Token loss in OPERATIONAL.
log_syslog: lost message: Resource temporarily unavailable
Oct 27 7:27:39 [NOTICE ] [GMI ] entering GATHER state.
Oct 27 7:27:39 [NOTICE ] [GMI ] SENDING attempt join because this node is ring rep.
log_syslog: lost message: Resource temporarily unavailable
Oct 27 7:27:39 [NOTICE ] [GMI ] I am the only member.
log_syslog: lost message: Resource temporarily unavailable
Oct 27 7:27:39 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Oct 27 7:27:39 [NOTICE ] [CLM ] New Configuration:
Oct 27 7:27:39 [NOTICE ] [CLM ] 192.168.1.18
log_syslog: lost message: Resource temporarily unavailable
Oct 27 7:27:39 [NOTICE ] [CLM ] Members Left:
Oct 27 7:27:39 [NOTICE ] [CLM ] 192.168.1.8
Oct 27 7:27:39 [NOTICE ] [CLM ] 192.168.1.17
log_syslog: lost message: Resource temporarily unavailable
Oct 27 7:27:39 [NOTICE ] [CLM ] 192.168.1.19
Oct 27 7:27:39 [NOTICE ] [CLM ] Members Joined:
Oct 27 7:27:39 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Oct 27 7:27:39 [NOTICE ] [CLM ] New Configuration:
Oct 27 7:27:39 [NOTICE ] [CLM ] 192.168.1.18
Oct 27 7:27:39 [NOTICE ] [CLM ] Members Left:
Oct 27 7:27:39 [NOTICE ] [CLM ] Members Joined:
That's all. I interrupted the program and did a stack dump. It was in
poll_run.
Program received signal SIGINT, Interrupt.
0x420d224b in poll () from /lib/i686/libc.so.6
(gdb) bt
#0 0x420d224b in poll () from /lib/i686/libc.so.6
#1 0x08057fc0 in poll_run (handle=0) at aispoll.c:371
#2 0x0804a736 in main (argc=1, argv=0xbffffa24) at main.c:989
#3 0x420158d4 in __libc_start_main () from /lib/i686/libc.so.6
(gdb) c
Continuing.
(The "log_syslog: lost message: Resource temporarily unavailable"
lines are debug output I added to see why messages were being dropped
from syslog.)
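
For anyone wondering where "Resource temporarily unavailable" comes from:
it is the strerror() text for EAGAIN. The sketch below is not the openais
log_syslog code; try_send_log and flood_until_drop are hypothetical names.
It assumes the drops come from a non-blocking send on a local datagram
socket whose receiver is not keeping up, which this thread does not
confirm. It also assumes Linux, where a full AF_UNIX datagram queue
reports EAGAIN to a non-blocking sender.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from openais): send one log line on a
 * datagram socket. Returns 0 on success, or the errno value on failure,
 * after printing a diagnostic in the style of the debug lines above. */
static int try_send_log(int fd, const char *msg)
{
    assert(fd >= 0 && msg != NULL);
    if (send(fd, msg, strlen(msg), 0) < 0) {
        int err = errno;
        fprintf(stderr, "log_syslog: lost message: %s\n", strerror(err));
        return err;
    }
    return 0;
}

/* Hypothetical demo: create a local datagram socket pair, never read
 * from the receiving end, and send until the first failure.
 * Returns the errno of that failure, 0 if every send succeeded,
 * or -1 if the sockets could not be set up. */
static int flood_until_drop(int max_msgs)
{
    int sv[2];
    int err = 0;
    int i;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    /* Non-blocking: a full queue fails the send instead of stalling. */
    if (fcntl(sv[0], F_SETFL, O_NONBLOCK) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    for (i = 0; i < max_msgs && err == 0; i++)
        err = try_send_log(sv[0], "event delivered\n");

    close(sv[0]);
    close(sv[1]);
    return err;
}
```

Calling flood_until_drop(100000) from a small main() should hit the error
after a handful of sends on a default Linux setup, since nothing drains
the receiving socket. A blocking socket would stall the sender instead of
dropping the message.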
>
> There may be some kind of bug with this because ckptbench freezes
> (meaning it lost some messages) sometimes during operation.
>
> Thanks
> -steve
>
--
Mark Haverkamp <markh at osdl.org>