defect 1170 - assert in memb_state_recover_enter (was) Re: [Openais] looks like the synchronization code is still broke

Steven Dake sdake at redhat.com
Fri Apr 7 09:33:43 PDT 2006


Fabien,
When you get these crashes, please post the last 2-3 pages of log output
as well as the contents of instance and any local variables.

I see 3 separate bugs here.  Could you file bugzillas on the 3 bugs and
I'll work on seeing if I can figure out why they are crashing?  Make
sure to attach these backtraces and the appropriate logs.

Note that I don't see any of these crashes here, so they must only occur
with the slower devices you are using (some kind of timing problem).

I will get to the bottom of your crashes.

Regards
-steve

On Fri, 2006-04-07 at 10:17 +0200, Fabien THOMAS wrote:
> I've set the MTU to 1400 and it seems to solve the problem I had
> before, but it still doesn't work: I have 3 different cores now :(
> 
> I have 4 nodes: aisexec is randomly killed on 3 of them, and one
> randomly accesses checkpoint data.
> In less than 3 minutes, 2 of the 4 nodes crashed at least 3 times.
> 
> Find the information for the 3 cores attached:
> 
> core1 (10.2.20.254,07-04-aisexec-bug3.core):
> 
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd"...
> Core was generated by `aisexec'.
> Program terminated with signal 11, Segmentation fault.
> Reading symbols from /usr/lib/libpthread.so.2...done.
> Loaded symbols for /usr/lib/libpthread.so.2
> Reading symbols from /lib/libc.so.6...done.
> Loaded symbols for /lib/libc.so.6
> Reading symbols from /libexec/ld-elf.so.1...done.
> Loaded symbols for /libexec/ld-elf.so.1
> #0  0x281893e6 in memcpy () from /lib/libc.so.6
> [New LWP 100061]
> (gdb) bt
> #0  0x281893e6 in memcpy () from /lib/libc.so.6
> #1  0x08060ce6 in compute () at ykd.c:238
> #2  0x080612b5 in ykd_deliver_fn (source_addr=0x3fbfe9b0, iovec=0x3fbfc5c0,
>      iov_len=1, endian_conversion_required=0) at ykd.c:401
> #3  0x08057bba in app_deliver_fn (source_addr=0x3fbfe9b0, iovec=0x808b148,
>      iov_len=1, endian_conversion_required=0) at totempg.c:343
> #4  0x080579dd in totempg_deliver_fn (source_addr=0x3fbfe9b0, iovec=0x83d5658,
>      iov_len=1, endian_conversion_required=0) at totempg.c:539
> #5  0x08056f05 in totemmrp_deliver_fn (source_addr=0x3fbfe9b0,
>      iovec=0x83d5658, iov_len=1, endian_conversion_required=0) at totemmrp.c:81
> #6  0x08054cc3 in messages_deliver_to_app (instance=0x83c6000, skip=0,
>      end_point=386) at totemsrp.c:3164
> #7  0x08055065 in message_handler_mcast (instance=0x83c6000,
>      system_from=0x3fbfeb90, msg=0x83e2650, msg_len=93,
>      endian_conversion_needed=0) at totemsrp.c:3301
> #8  0x08056dc5 in main_deliver_fn (context=0x83c6000, system_from=0x3fbfeb90,
>      msg=0x83e2650, msg_len=93) at totemsrp.c:3720
> #9  0x0804df12 in active_mcast_recv (instance=0x83b4700, context=0x83c6000,
>      system_from=0x3fbfeb90, msg=0x83e2650, msg_len=93) at totemrrp.c:393
> #10 0x0804e2be in rrp_deliver_fn (context=0x83b5670, system_from=0x3fbfeb90,
>      msg=0x83e2650, msg_len=93) at totemrrp.c:549
> #11 0x0804c3b6 in net_deliver_fn (handle=0, fd=6, revents=1, data=0x83e2000,
>      prio=0x83c1440) at totemnet.c:687
> #12 0x0804ab76 in poll_run (handle=0) at aispoll.c:424
> ---Type <return> to continue, or q <return> to quit---
> #13 0x0805fdaf in main (argc=1, argv=0x3fbfee88) at main.c:1317
> (gdb) frame 1
> #1  0x08060ce6 in compute () at ykd.c:238
> 238     ykd.c: No such file or directory.
>          in ykd.c
> (gdb) print i
> $1 = 0
> (gdb) print j
> $2 = 3375
> (gdb) print state_received_process[i].ykd_state.ambiguous_sessions_entries
> $3 = 1212459015
> (gdb) print  state_received_process[i].ykd_state
> $4 = {last_primary = {member_list = {{nodeid = 34209795, family = 1537,
>          addr = "\002\000\n\002\001\006\000\000\000\000\000\000\000 
> \000\000"}, {
>          nodeid = 0, family = 0,
>          addr = "\000\000\000\000\005\000stat\000\000\000\000\000"}, {
>          nodeid = 0, family = 0,
>          addr = '\0' <repeats 15 times>} <repeats 11 times>, {nodeid  
> = 0,
>          family = 19,
>          addr = "\000\000\000\000\000\000\000\000\000\000\b\205\000 
> \000F2"}, {
>          nodeid = 1096298544, family = 12337,
>          addr = "5910400601\000\001\000\000\000"}, {nodeid = 1526726789,
>          family = 19579, addr = "?\r", '\0' <repeats 13 times>},  
> {nodeid = 0,
>          family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,  
> family = 0,
>          addr = '\0' <repeats 15 times>}, {nodeid = 3904256, family = 0,
>          addr = "\000\000\000\000?\001\000\000?\003\002dynam"}, {
>          nodeid = 1830839145, family = 28005,
>          addr = "ory", '\0' <repeats 12 times>}, {nodeid = 0, family  
> = 0,
>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>          addr = '\0' <repeats 12 times>, "?\001\001t"}, {nodeid =  
> 1869639013,
>          family = 24946, addr = "ry storage\000\000\000\000\000"},  
> {nodeid = 0,
>          family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,  
> family = 0,
>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
> ---Type <return> to continue, or q <return> to quit---
>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>          addr = '\0' <repeats 15 times>}, {nodeid = 1929445880,  
> family = 29283,
>          addr = "ipting engine\000\000"}}, member_list_entries = 0,
>      session_id = 0}, last_formed = {{member_list = {{nodeid = 0,  
> family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = "\000\000\000\000\000\000\000?\001\000\000\000?\200 
> \000p"}, {
>            nodeid = 1768387948, family = 110, addr = '\0' <repeats 15  
> times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 1677827264, family = 29793,
>            addr = "a tracking\000\000\000\000\000"}, {nodeid = 0,  
> family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 504,
>            addr = "\001filter\000\000\000\000\000\000\000\000"},  
> {nodeid = 0,
> ---Type <return> to continue, or q <return> to quit---
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = "\000\000?\001\001fragment\000\000"}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0,
>            addr = "\000@?\001\000\000?\001\001matchin"}, {nodeid =  
> 1852121191,
>            family = 26983, addr = "ne", '\0' <repeats 13 times>},  
> {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}},
>        member_list_entries = 0, session_id = 0}, {member_list =  
> {{nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0,
>            addr = "\000\000?\001\001QoS\000\000\000\000\000\000\000"}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
> ---Type <return> to continue, or q <return> to quit---
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0,
>            addr = "\000\000\000\000\000\000?\001\001HTML pa"}, {
>            nodeid = 1919251314, family = 0, addr = '\0' <repeats 15  
> times>}, {
>            nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>} <repeats 21 times>},
>        member_list_entries = 0, session_id = 0}, {member_list =  
> {{nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>} <repeats 20  
> times>, {
>            nodeid = 0, family = 0,
>            addr = "\000@\003", '\0' <repeats 12 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0,
>            addr = "\000\000\000\000\000\000\000\200\000\000\000\000 
> \000\000\000"}, {nodeid = 0, family = 0, addr = '\0' <repeats 15  
> times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 17664, addr = "+F", '\0' <repeats 13 times>, "E"}, {
>            nodeid = 17963, family = 0, addr = '\0' <repeats 15  
> times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}},
>        member_list_entries = 0, session_id = 1177240832},  
> {member_list = {{
> ---Type <return> to continue, or q <return> to quit---
>            nodeid = 0, family = 0,
>            addr = "\000\000\000\000\000\000\000E+F\000\000\000\000 
> \000"}, {
>            nodeid = 0, family = 0,
>            addr = "\000\000\000\000\000\000\000\000\000/\201\000\000 
> \000\000"},
>          {nodeid = 7936, family = 0,
>            addr = "\000\000\000\037\000\000\000\000\000\000\000?\001 
> \000\000"},
>          {nodeid = 587202560, family = 127, addr = '\0' <repeats 15  
> times>}, {
>            nodeid = 0, family = 6144,
>            addr = "\000\000\000\000\000\000\000e\025\t\001\000\000\000 
> \000\034"}, {nodeid = 250, family = 0,
>            addr = "\000\211\031\b\001\000\000\000\000\000\000\000\000 
> \000\000"}, {nodeid = 733874688, family = 149,
>            addr = "\000\000\000?6\000\000]3\t\000??\000\000\035"}, {
>            nodeid = 1023410639, family = 49259,
>            addr = "?\r\000\000\000^m+\225\000\000\000\000?Ǹ"},  
> {nodeid = 3498,
>            family = 25600,
>            addr = "\207?\224\000\000\000\000\000?M\000\000\000\000 
> \000?"}, {
>            nodeid = 205, family = 0,
>            addr = "\000\224?L", '\0' <repeats 11 times>}, {nodeid =  
> 2328321792,
>            family = 58,
>            addr = "\000\000\000?\021\000\000??\003\000?;\000\000 
> \231"}, {
>            nodeid = 1023410313, family = 49259,
>            addr = "?\r\000\000\000??\212:\000\000\000\000?\0003"}, {
>            nodeid = 3499, family = 6144,
> ---Type <return> to continue, or q <return> to quit---
>            addr = "?z:\000\000\000\000\001\000\000\000\016\000\000 
> \000"}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0,
>            addr = "\000\000\000=k??\r\000\000\000\000\000\000\000"}, {
>            nodeid = 889192448, family = 41377,
>            addr = "?\r", '\0' <repeats 13 times>}, {nodeid = 0,  
> family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = "\000\000\000=k??\r\000\000\000\000\000\000\000"}, {
>            nodeid = 889192448, family = 41377,
>            addr = "?\r", '\0' <repeats 13 times>}, {nodeid = 0,  
> family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>            addr = "\000\000\000\000\000\000\000\000\000=k??\r\000"}, {
>            nodeid = 0, family = 0,
>            addr = "\000\000\0005???\r\000\000\000\000\000\000\000"}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}},
> ---Type <return> to continue, or q <return> to quit---
>        member_list_entries = 0, session_id = 0}, {member_list =  
> {{nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0,
>            addr = "\000=k??\r\000\000\000\000\000\000\000\000\000"}, {
>            nodeid = 2711696640, family = 3500, addr = '\0' <repeats  
> 15 times>},
>          {nodeid = 0, family = 0, addr = '\0' <repeats 15 times>},  
> {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0,
>            addr = "\000\000\000\000\000\000\000=k??\r\000\000\000"}, {
>            nodeid = 0, family = 0,
>            addr = "\0005???\r\000\000\000\000\000\000\000\000\000"}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0,
>            addr = "\000\000\000\000\000\000\000=k??\r\000\000\000"}, {
>            nodeid = 0, family = 0,
>            addr = "\0005???\r\000\000\000\000\000\000\000\000\000"}, {
>            nodeid = 0, family = 0,
>            addr = "\000\000\0002\000\000\000\000\000\000\000\002\000 
> \000\000"},
>          {nodeid = 33554432, family = 0,
>            addr = '\0' <repeats 13 times>, "2\000"}, {nodeid = 0,  
> family = 0,
> ---Type <return> to continue, or q <return> to quit---
>            addr = '\0' <repeats 15 times>, "\001"}, {nodeid = 0,  
> family = 0,
>            addr = "\000(=", '\0' <repeats 12 times>}, {nodeid = 4007936,
>            family = 0, addr = '\0' <repeats 11 times>, "(?-\000"},  
> {nodeid = 0,
>            family = 0,
>            addr = "\000\035=\000\000\000\000\000\000\004\001\000 
> \000=k?"}, {
>            nodeid = 3500, family = 10240,
>            addr = "?-\000\000\000\000\0005???\r\000\000\000("}, {
>            nodeid = 11689, family = 0, addr = '\0' <repeats 15  
> times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 13 times>,  
> "=k?"}, {
>            nodeid = 3500, family = 0,
>            addr = "\000\000\000\000\000\000\0005???\r\000\000\000"}, {
>            nodeid = 0, family = 0,
>            addr = "\000\000\000\000\000\001\000\000\000\000\000\000 
> \000\000\000"}, {nodeid = 0, family = 0, addr = '\0' <repeats 15  
> times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid =  
> 1023410176,
>            family = 49259, addr = "?\r", '\0' <repeats 11 times>,  
> "5??"}},
>        member_list_entries = 3500, session_id = 0}, {member_list = {{
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>            nodeid = 0, family = 0, addr = '\0' <repeats 11 times>,  
> "=k??\r"}, {
> ---Type <return> to continue, or q <return> to quit---q
> noQuit
> (gdb)
> 
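
A note on core1: frame 1 shows ambiguous_sessions_entries = 1212459015 with
j already at 3375, and the ykd_state dump contains ASCII fragments such as
"ry storage" and "ipting engine". That pattern suggests the entry count was
read from corrupt or misparsed data and then used to drive the copy in
compute(). The following is only a minimal sketch of a count check placed in
front of such a copy. The struct layouts, SESSIONS_MAX and
sessions_copy_checked() are hypothetical names for illustration and are not
taken from ykd.c.

```c
#include <stddef.h>
#include <string.h>

#define SESSIONS_MAX 8	/* hypothetical capacity, not taken from ykd.c */

struct session {	/* hypothetical stand-in for one session record */
	unsigned int session_id;
	unsigned int member_count;
};

struct state_msg {	/* hypothetical received state message */
	unsigned int entries;	/* count supplied by the sender */
	struct session sessions[SESSIONS_MAX];
};

/*
 * Copy the sessions of a received state message into dst.
 * Returns the number of entries copied, or -1 if the advertised count
 * cannot fit in the fixed-size array (corrupt or misparsed message).
 */
static int sessions_copy_checked (
	struct session *dst,
	const struct state_msg *src)
{
	if (src->entries > SESSIONS_MAX) {
		return (-1);
	}
	memcpy (dst, src->sessions, src->entries * sizeof (struct session));
	return ((int)src->entries);
}
```

With a check of this kind, a count like the one in the dump would be rejected
and could be logged instead of faulting inside memcpy.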
> ========================================================================
> core2 (10.2.1.7,07-04-aisexec-bug1.core):
> 
> 
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd"...
> Attaching to program: /log/07-04-aisexec-bug1, process 7
> ptrace: Invalid argument.
> 
> warning: core file may not match specified executable file.
> Core was generated by `aisexec'.
> Program terminated with signal 6, Aborted.
> Reading symbols from /usr/lib/libpthread.so.2...done.
> Loaded symbols for /usr/lib/libpthread.so.2
> Reading symbols from /lib/libc.so.6...done.
> Loaded symbols for /lib/libc.so.6
> Reading symbols from /libexec/ld-elf.so.1...done.
> Loaded symbols for /libexec/ld-elf.so.1
> #0  0x28187723 in kill () from /lib/libc.so.6
> [New LWP 100102]
> (gdb) bt
> #0  0x28187723 in kill () from /lib/libc.so.6
> #1  0x280b61da in raise () from /usr/lib/libpthread.so.2
> #2  0x281863d4 in abort () from /lib/libc.so.6
> #3  0x28164358 in __assert () from /lib/libc.so.6
> #4  0x08050aa2 in sq_item_get (sq=0x83c77c0, seq_id=256,
>      sq_item_out=0x3fbfde10) at sq.h:254
> #5  0x08052558 in update_aru (instance=0x83c6000) at totemsrp.c:1964
> #6  0x080527de in orf_token_mcast (instance=0x83c6000, token=0x3fbfe4c0,
>      fcc_mcasts_allowed=0, system_from=0x3fbfeb90) at totemsrp.c:2061
> #7  0x080543f0 in message_handler_orf_token (instance=0x83c6000,
>      system_from=0x3fbfeb90, msg=0x83e2650, msg_len=88,
>      endian_conversion_needed=0) at totemsrp.c:2920
> #8  0x08056dc5 in main_deliver_fn (context=0x83c6000, system_from=0x3fbfeb90,
>      msg=0x83e2650, msg_len=88) at totemsrp.c:3720
> #9  0x0804e193 in active_token_recv (instance=0x83b4700, interface_no=0,
>      context=0x83c6000, system_from=0x3fbfeb90, msg=0x83e2650, msg_len=88,
>      token_seqid=272) at totemrrp.c:477
> #10 0x0804e296 in rrp_deliver_fn (context=0x83b5670, system_from=0x3fbfeb90,
>      msg=0x83e2650, msg_len=88) at totemrrp.c:537
> #11 0x0804c3b6 in net_deliver_fn (handle=0, fd=8, revents=1, data=0x83e2000,
>      prio=0x83c1454) at totemnet.c:687
> #12 0x0804ab76 in poll_run (handle=0) at aispoll.c:424
> #13 0x0805fdaf in main (argc=1, argv=0x3fbfee88) at main.c:1317
> (gdb) frame 4
> #4  0x08050aa2 in sq_item_get (sq=0x83c77c0, seq_id=256,
>      sq_item_out=0x3fbfde10) at sq.h:254
> 254     sq.h: No such file or directory.
>          in sq.h
> (gdb) print *sq
> $1 = {head = 0, size = 256, items = 0x83d7000, items_inuse = 0x83c4000,
>    size_per_item = 44, head_seqid = 0, item_count = 256, pos_max = 255}
> (gdb) frame 5
> #5  0x08052558 in update_aru (instance=0x83c6000) at totemsrp.c:1964
> 1964    totemsrp.c: No such file or directory.
>          in totemsrp.c
> (gdb) print *instance
> $2 = {first_run = 1, fcc_remcast_last = 0, fcc_mcast_last = 0,
>    fcc_mcast_current = 0, fcc_remcast_current = 0, consensus_list =  
> {{addr = {
>          nodeid = 117506570, family = 2,
>          addr = "\n\002\001\a??;?\005\bL1;\b`?"}, set = 1}, {addr = {
>          nodeid = 100729354, family = 2,
>          addr = "\n\002\001\006??;?\005\bL1;\b\000?"}, set = 1},  
> {addr = {
>          nodeid = 4262724106, family = 2,
>          addr = "\n\002\024???;?\005\bL1;\b\000?"}, set = 1}, {addr = {
>          nodeid = 84607498, family = 2,
>          addr = "\n\002\v\005??;?\005\bL1;\b\000?"}, set = 1}, {addr = {
>          nodeid = 0, family = 0, addr = '\0' <repeats 15 times>},
>        set = 0} <repeats 28 times>}, consensus_list_entries = 4,
>    my_proc_list = {{nodeid = 100729354, family = 2,
>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =  
> 117506570,
>        family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>        nodeid = 4262724106, family = 2,
>        addr = "\n\002\024?", '\0' <repeats 11 times>}, {nodeid =  
> 84607498,
>        family = 2, addr = "\n\002\v\005", '\0' <repeats 11 times>}, {
>        nodeid = 0, family = 0,
>        addr = '\0' <repeats 15 times>} <repeats 28 times>},  
> my_failed_list = {{
>        nodeid = 4262724106, family = 2, addr = "\n\002\024???;?\005 
> \bL1;\b??"},
>      {nodeid = 84607498, family = 2,
>        addr = "\n\002\v\005", '\0' <repeats 11 times>}, {nodeid = 0,
>        family = 0, addr = '\0' <repeats 15 times>} <repeats 30 times>},
> ---Type <return> to continue, or q <return> to quit---
>    my_new_memb_list = {{nodeid = 100729354, family = 2,
>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =  
> 117506570,
>        family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>        nodeid = 84607498, family = 2,
>        addr = "\n\002\v\005", '\0' <repeats 11 times>}, {nodeid =  
> 4262724106,
>        family = 2, addr = "\n\002\024?", '\0' <repeats 11 times>},  
> {nodeid = 0,
>        family = 0, addr = '\0' <repeats 15 times>} <repeats 28 times>},
>    my_trans_memb_list = {{nodeid = 100729354, family = 2,
>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =  
> 117506570,
>        family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>        nodeid = 4262724106, family = 2,
>        addr = "\n\002\024?", '\0' <repeats 11 times>}, {nodeid = 0,  
> family = 0,
>        addr = '\0' <repeats 15 times>} <repeats 29 times>},  
> my_memb_list = {{
>        nodeid = 100729354, family = 2,
>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =  
> 117506570,
>        family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>        nodeid = 4262724106, family = 2,
>        addr = "\n\002\024?", '\0' <repeats 11 times>}, {nodeid =  
> 4262724106,
>        family = 2, addr = "\n\002\024?", '\0' <repeats 11 times>},  
> {nodeid = 0,
>        family = 0, addr = '\0' <repeats 15 times>} <repeats 28 times>},
>    my_deliver_memb_list = {{nodeid = 100729354, family = 2,
>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =  
> 117506570,
>        family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>        nodeid = 4262724106, family = 2,
> ---Type <return> to continue, or q <return> to quit---
>        addr = "\n\002\024?", '\0' <repeats 11 times>}, {nodeid = 0,  
> family = 0,
>        addr = '\0' <repeats 15 times>} <repeats 29 times>},
>    my_nodeid_lookup_list = {{nodeid = 117506570, family = 2,
>        addr = "\n\002\001\a", '\0' <repeats 11 times>}, {nodeid =  
> 100729354,
>        family = 2, addr = "\n\002\001\006", '\0' <repeats 11 times>}, {
>        nodeid = 84607498, family = 2,
>        addr = "\n\002\v\005", '\0' <repeats 11 times>}, {nodeid =  
> 4262724106,
>        family = 2, addr = "\n\002\024?", '\0' <repeats 11 times>},  
> {nodeid = 0,
>        family = 0, addr = '\0' <repeats 15 times>} <repeats 28 times>},
>    my_proc_list_entries = 4, my_failed_list_entries = 0,
>    my_new_memb_entries = 4, my_trans_memb_entries = 3,  
> my_memb_entries = 3,
>    my_deliver_memb_entries = 3, my_nodeid_lookup_entries = 4,  
> my_ring_id = {
>      rep = {nodeid = 100729354, family = 2,
>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, seq = 87120},
>    my_old_ring_id = {rep = {nodeid = 100729354, family = 2,
>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, seq = 87116},
>    my_aru_count = 0, my_merge_detect_timeout_outstanding = 0,
>    my_last_aru = 272, my_seq_unchanged = 0, my_received_flg = 0,
>    my_high_seq_received = 272, my_install_seq = 0,  
> my_rotation_counter = 0,
>    my_set_retrans_flg = 1, my_retrans_flg_count = 0,
>    my_high_ring_delivered = 0, heartbeat_timeout = 764,  
> new_message_queue = {
>      head = 108, tail = 50, used = 57, usedhw = 57, size = 195,
>      items = 0x83e7000, size_per_item = 48, iterator = 0},
>    retrans_message_queue = {head = 0, tail = 499, used = 0, usedhw = 0,
> ---Type <return> to continue, or q <return> to quit---
>      size = 500, items = 0x83ce000, size_per_item = 48, iterator = 0},
>    regular_sort_queue = {head = 0, size = 256, items = 0x83d4000,
>      items_inuse = 0x83c0c00, size_per_item = 44, head_seqid = 0,
>      item_count = 256, pos_max = 0}, recovery_sort_queue = {head = 0,
>      size = 256, items = 0x83d7000, items_inuse = 0x83c4000,
>      size_per_item = 44, head_seqid = 0, item_count = 256, pos_max =  
> 255},
>    my_aru = 255, my_high_delivered = 0,  
> token_callback_received_listhead = {
>      next = 0x83b3440, prev = 0x83b3440},  
> token_callback_sent_listhead = {
>      next = 0x83c77f0, prev = 0x83c77f0}, orf_token_retransmit =  
> 0x83ca000 "",
>    orf_token_retransmit_size = 88, my_token_seq = 64,
>    timer_orf_token_timeout = 0x856d660,
>    timer_orf_token_retransmit_timeout = 0x85f0d20,
>    timer_orf_token_hold_retransmit_timeout = 0x0,
>    timer_merge_detect_timeout = 0x0,
>    memb_timer_state_gather_join_timeout = 0x0,
>    memb_timer_state_gather_consensus_timeout = 0x0,
>    memb_timer_state_commit_timeout = 0x0, timer_heartbeat_timeout =  
> 0x83b33e0,
>    totemsrp_log_level_security = 65538, totemsrp_log_level_error =  
> 131074,
>    totemsrp_log_level_warning = 196610, totemsrp_log_level_notice =  
> 262146,
>    totemsrp_log_level_debug = 327682,
>    totemsrp_log_printf = 0x805ff60 <internal_log_printf>,
>    memb_state = MEMB_STATE_RECOVERY, my_id = {nodeid = 117506570,  
> family = 2,
>      addr = "\n\002\001\a", '\0' <repeats 11 times>}, next_memb = {
>      nodeid = 84607498, family = 2,
> ---Type <return> to continue, or q <return> to quit---
>      addr = "\n\002\v\005", '\0' <repeats 11 times>},
>    iov_buffer = '\0' <repeats 8999 times>, totemsrp_iov_recv =  
> {iov_base = 0x0,
>      iov_len = 0}, totemsrp_poll_handle = 0, totemsrp_recv = 0,
>    mcast_address = {nodeid = 0, family = 2,
>      addr = "?^\001\003", '\0' <repeats 11 times>},
>    totemsrp_deliver_fn = 0x8056edc <totemmrp_deliver_fn>,
>    totemsrp_confchg_fn = 0x8056f10 <totemmrp_confchg_fn>,  
> global_seqno = 246,
>    my_token_held = 0, token_ring_id_seq = 87120, last_released = 0,
>    set_aru = 4294967295, old_ring_state_saved = 1, old_ring_state_aru  
> = 0,
>    old_ring_state_high_seq_received = 0, ring_saved = 1, my_last_seq  
> = 272,
>    tv_old = {tv_sec = 0, tv_usec = 0}, totemrrp_handle = 0,
>    totem_config = 0x3fbfed14, use_heartbeat = 1, my_trc = 0, my_pbl = 0}
> (gdb)
> (gdb)
> (gdb) print range
> $3 = 17
> 
> 
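
A note on core2: sq_item_get() was called with seq_id=256 on a queue whose
dump shows head_seqid = 0 and size = 256 (the items pointer 0x83d7000 matches
recovery_sort_queue in the instance dump). The instance has my_aru = 255 and
my_high_seq_received = 272, and the printed range = 17 equals 272 - 255.
Assuming the assert at sq.h:254 is a window check of the form
"seq_id is within size entries of head_seqid" (an assumption, since the
source line is not in the trace), 256 is the first sequence number outside
the window. The sketch below only reproduces that arithmetic. sq_view,
sq_in_window() and probe_range() are hypothetical names, not the sq.h API.

```c
/* The two sort queue fields from "print *sq" that matter here. */
struct sq_view {
	unsigned int head_seqid;
	unsigned int size;
};

/*
 * Returns 1 if seq_id falls inside [head_seqid, head_seqid + size),
 * 0 otherwise.  Unsigned subtraction makes a seq_id below head_seqid
 * wrap to a large value, so it is rejected as well.
 */
static int sq_in_window (const struct sq_view *sq, unsigned int seq_id)
{
	return (seq_id - sq->head_seqid < sq->size);
}

/*
 * Number of sequence numbers between the all-received-up-to point and
 * the highest sequence number seen (assumed meaning of "range").
 */
static unsigned int probe_range (unsigned int aru, unsigned int high_seq)
{
	return (high_seq - aru);
}
```

With the values from the dump, sequence number 255 is inside the window and
256 is not, which would fit a queue whose head was never advanced past 0
while the node was in MEMB_STATE_RECOVERY.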
> ========================================================================
> core3 (10.2.1.7,07-04-aisexec-bug2.core):
> 
> F200XA105910400601>gdb 07-04-aisexec-bug2 07-04-aisexec-bug2.core
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd"...
> Attaching to program: /log/07-04-aisexec-bug2, process 7
> ptrace: Invalid argument.
> 
> warning: core file may not match specified executable file.
> Core was generated by `aisexec'.
> Program terminated with signal 11, Segmentation fault.
> Reading symbols from /usr/lib/libpthread.so.2...done.
> Loaded symbols for /usr/lib/libpthread.so.2
> Reading symbols from /lib/libc.so.6...done.
> Loaded symbols for /lib/libc.so.6
> Reading symbols from /libexec/ld-elf.so.1...done.
> Loaded symbols for /libexec/ld-elf.so.1
> #0  0x280f8e21 in memcmp () from /lib/libc.so.6
> [New LWP 100093]
> (gdb) bt
> #0  0x280f8e21 in memcmp () from /lib/libc.so.6
> #1  0x08057ce7 in group_matches (iovec=0x808b148, iov_len=1,
>      groups_b=0x83b56d0, group_b_cnt=1, adjust_iovec=0x3fbfc5bc)
>      at totempg.c:308
> #2  0x08057b86 in app_deliver_fn (source_addr=0x3fbfe9b0, iovec=0x808b148,
>      iov_len=1, endian_conversion_required=0) at totempg.c:340
> #3  0x080579dd in totempg_deliver_fn (source_addr=0x3fbfe9b0, iovec=0x83d5760,
>      iov_len=1, endian_conversion_required=0) at totempg.c:539
> #4  0x08056f05 in totemmrp_deliver_fn (source_addr=0x3fbfe9b0,
>      iovec=0x83d5760, iov_len=1, endian_conversion_required=0) at totemmrp.c:81
> #5  0x08054cc3 in messages_deliver_to_app (instance=0x83c6000, skip=0,
>      end_point=136) at totemsrp.c:3164
> #6  0x08055065 in message_handler_mcast (instance=0x83c6000,
>      system_from=0x3fbfeb90, msg=0x83e2650, msg_len=1336,
>      endian_conversion_needed=0) at totemsrp.c:3301
> #7  0x08056dc5 in main_deliver_fn (context=0x83c6000, system_from=0x3fbfeb90,
>      msg=0x83e2650, msg_len=1336) at totemsrp.c:3720
> #8  0x0804df12 in active_mcast_recv (instance=0x83b4700, context=0x83c6000,
>      system_from=0x3fbfeb90, msg=0x83e2650, msg_len=1336) at totemrrp.c:393
> #9  0x0804e2be in rrp_deliver_fn (context=0x83b5670, system_from=0x3fbfeb90,
>      msg=0x83e2650, msg_len=1336) at totemrrp.c:549
> #10 0x0804c3b6 in net_deliver_fn (handle=0, fd=6, revents=1, data=0x83e2000,
>      prio=0x83b4880) at totemnet.c:687
> #11 0x0804ab76 in poll_run (handle=0) at aispoll.c:424
> ---Type <return> to continue, or q <return> to quit---frame 1
> #12 0x0805fdaf in main (argc=1, argv=0x3fbfee88) at main.c:1317
> (gdb) print *iovec
> No symbol "iovec" in current context.
> (gdb) frame 1
> #1  0x08057ce7 in group_matches (iovec=0x808b148, iov_len=1,
>      groups_b=0x83b56d0, group_b_cnt=1, adjust_iovec=0x3fbfc5bc)
>      at totempg.c:308
> 308     totempg.c: No such file or directory.
>          in totempg.c
> (gdb) print *iovec
> $1 = {iov_base = 0x8525016, iov_len = 72033}
> (gdb) print *groups_b
> $2 = {group = 0x807e617, group_len = 1}
> (gdb)
> $3 = {group = 0x807e617, group_len = 1}
> (gdb) print i
> $4 = 7833
> (gdb) print j
> $5 = 0
> (gdb) print group_len[0]
> $6 = 41035
> (gdb)
> 
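
A note on core3: the received message is 1336 bytes (msg_len=1336), yet in
group_matches() the iovec reports iov_len = 72033, group_len[0] is 41035 and
the loop index i has reached 7833, while the local group being compared has
group_len = 1. Lengths that exceed the packet they arrived in point at a
header being read from the wrong offset or from a damaged buffer. Below is a
minimal sketch of validating such lengths before calling memcmp. The wire
layout and first_group_matches() are hypothetical and chosen only for
illustration; they are not the totempg.c format.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical layout, for illustration only:
 *   unsigned short group_count;
 *   unsigned short group_len[group_count];
 *   char           group_names[];   concatenated, group_len[i] bytes each
 *
 * Returns 1 if the first group name in buf equals name, 0 if it does
 * not, and -1 if the header describes more data than buf holds.
 */
static int first_group_matches (
	const unsigned char *buf, size_t buf_len,
	const char *name, size_t name_len)
{
	unsigned short count;
	unsigned short len0;
	size_t header_len;

	if (buf_len < sizeof (count)) {
		return (-1);
	}
	memcpy (&count, buf, sizeof (count));
	if (count == 0) {
		return (-1);
	}
	header_len = sizeof (count) + (size_t)count * sizeof (len0);
	if (header_len > buf_len) {
		return (-1);
	}
	memcpy (&len0, buf + sizeof (count), sizeof (len0));
	if (len0 > buf_len - header_len) {
		return (-1);	/* name would run past the end of buf */
	}
	if (len0 != name_len) {
		return (0);
	}
	return (memcmp (buf + header_len, name, name_len) == 0);
}
```

A group length of 41035 in a short buffer is rejected by the length test
instead of being handed to memcmp.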



