[Openais] Re: new batch of bugs - was defect 1170 - assert in memb_state_recover_enter

Fabien THOMAS fabien.thomas at netasq.com
Mon Apr 24 05:11:13 PDT 2006


I'm back from holiday :)

> I've been swamped lately but have had a chance to spend a day studying
> these problems.
>
> When you run in debug mode could you run with "secauth" turned to on?
> This will verify that messages are being properly received over the
> network.
>
ok
> For core 1 I'm not sure how the max sessions is being set to such a
> large value.  This is the cause of the segfault.
>
> [i].ykd_state.ambiguous_sessions_entries
> $3 = 1212459015
>
> Could you try the attached core-1-debug.patch debug patch against trunk
> in your environment?  We are looking for ambiguous sessions that look
> large.  Logs of the output of stdout are helpful in this case.
>
OK, I will update to the latest trunk and run my stress test with the patch.
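
A minimal sketch of the kind of sanity check that would catch an entry count like the one above before it is used as a memcpy length. Everything here is an assumption for illustration: SKETCH_MAX_SESSIONS, struct state_sketch, entries_sane() and copy_sessions() are hypothetical names, not the code in ykd.c or in the debug patch.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_SESSIONS 32 /* hypothetical limit, not taken from ykd.c */

struct session_sketch {
    unsigned int session_id;
};

struct state_sketch {
    struct session_sketch ambiguous_sessions[SKETCH_MAX_SESSIONS];
    int ambiguous_sessions_entries;
};

/* Return 1 if the count fits the array, otherwise log it and return 0. */
static int entries_sane(int entries)
{
    if (entries < 0 || entries > SKETCH_MAX_SESSIONS) {
        fprintf(stderr, "suspicious ambiguous_sessions_entries: %d\n", entries);
        return 0;
    }
    return 1;
}

/* Copy the received sessions only when the count is plausible.
 * Returns the number of entries copied, or -1 if the count was rejected. */
static int copy_sessions(struct state_sketch *dst, const struct state_sketch *src)
{
    if (!entries_sane(src->ambiguous_sessions_entries)) {
        return -1;
    }
    memcpy(dst->ambiguous_sessions, src->ambiguous_sessions,
           sizeof(struct session_sketch) * (size_t)src->ambiguous_sessions_entries);
    dst->ambiguous_sessions_entries = src->ambiguous_sessions_entries;
    return dst->ambiguous_sessions_entries;
}
```

With a count such as 1212459015 the copy is refused and the value is logged to stderr, which is the sort of output requested for this core.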

> I have duplicated core 2 and understand the problem.  I will work out a
> patch this week.
>
> Are you seeing a lot of reconfigurations during normal running operation?

No, but I'm forcing reconfigurations because I kill aisexec at
random times on each node.
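
For the core 2 abort in sq_item_get (sq.h:254), a sliding-window range check on a sort queue can be sketched as below. This is an assumption about what such an assert guards, not a copy of sq.h: struct sq_sketch and sq_in_range() are hypothetical, and only the field names head_seqid and size come from the gdb output.

```c
/* Only the two fields needed for the range check; the real struct has more. */
struct sq_sketch {
    unsigned int head_seqid; /* sequence id stored at the head of the queue */
    unsigned int size;       /* number of slots in the queue */
};

/* Return 1 if seq_id lies in [head_seqid, head_seqid + size).
 * Unsigned subtraction keeps the test correct across sequence wraparound. */
static int sq_in_range(const struct sq_sketch *sq, unsigned int seq_id)
{
    return (seq_id - sq->head_seqid) < sq->size;
}
```

With the values printed in the core (head_seqid = 0, size = 256, seq_id = 256) a check of this form fails by one, which is consistent with the abort but does not by itself show where the real problem lies.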

> Are you in a cross endian environment?
>
no
> For core 3 I believe this problem is fixed by the strict aliasing
> patches that went into the tree.  I ran into this same problem on fc 5
> with gcc 4.1.  Which version of gcc are you using?

$gcc -v
Using built-in specs.
Configured with: FreeBSD/i386 system compiler
Thread model: posix
gcc version 3.4.4 [FreeBSD] 20050518

I'm also using a DEBUG build (-O0) to get better debugging support.
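
As background on the strict aliasing question: GCC enables -fstrict-aliasing at -O2 and above, so a -O0 build would not normally be affected by it. The usual source of such problems is reading a byte buffer through a pointer of a different type; a minimal sketch of the memcpy-based alternative follows. read_u16() is a hypothetical helper for illustration, not a function from totempg.c.

```c
#include <stdint.h>
#include <string.h>

/* Read a 16-bit field from a byte buffer without casting the buffer to
 * a uint16_t pointer. memcpy is well defined for any alignment and does
 * not violate the aliasing rules. */
static uint16_t read_u16(const unsigned char *buf)
{
    uint16_t value;

    memcpy(&value, buf, sizeof(value));
    return value;
}
```

The value is returned in host byte order; a cross-endian setup would need an explicit swap on top of this.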

>
> Regards
> -steve
>
> For core 2 below On Fri, 2006-04-07 at 10:17 +0200, Fabien THOMAS wrote:
>> I've set the MTU to 1400 and it seems to solve the problem I had
>> before, but it still doesn't work: I have 3 different cores now :(
>>
>> I have 4 nodes, with aisexec randomly killed on 3 of them and one that
>> randomly accesses checkpoint data.
>> In less than 3 minutes, 2 of the 4 nodes crashed at least 3 times.
>>
>> Find the information for the 3 cores attached:
>>
>> core1 (10.2.20.254,07-04-aisexec-bug3.core):
>>
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "i386-marcel-freebsd"...
>> Core was generated by `aisexec'.
>> Program terminated with signal 11, Segmentation fault.
>> Reading symbols from /usr/lib/libpthread.so.2...done.
>> Loaded symbols for /usr/lib/libpthread.so.2
>> Reading symbols from /lib/libc.so.6...done.
>> Loaded symbols for /lib/libc.so.6
>> Reading symbols from /libexec/ld-elf.so.1...done.
>> Loaded symbols for /libexec/ld-elf.so.1
>> #0  0x281893e6 in memcpy () from /lib/libc.so.6
>> [New LWP 100061]
>> (gdb) bt
>> #0  0x281893e6 in memcpy () from /lib/libc.so.6
>> #1  0x08060ce6 in compute () at ykd.c:238
>> #2  0x080612b5 in ykd_deliver_fn (source_addr=0x3fbfe9b0, iovec=0x3fbfc5c0,
>>      iov_len=1, endian_conversion_required=0) at ykd.c:401
>> #3  0x08057bba in app_deliver_fn (source_addr=0x3fbfe9b0, iovec=0x808b148,
>>      iov_len=1, endian_conversion_required=0) at totempg.c:343
>> #4  0x080579dd in totempg_deliver_fn (source_addr=0x3fbfe9b0, iovec=0x83d5658,
>>      iov_len=1, endian_conversion_required=0) at totempg.c:539
>> #5  0x08056f05 in totemmrp_deliver_fn (source_addr=0x3fbfe9b0,
>>      iovec=0x83d5658, iov_len=1, endian_conversion_required=0) at totemmrp.c:81
>> #6  0x08054cc3 in messages_deliver_to_app (instance=0x83c6000, skip=0,
>>      end_point=386) at totemsrp.c:3164
>> #7  0x08055065 in message_handler_mcast (instance=0x83c6000,
>>      system_from=0x3fbfeb90, msg=0x83e2650, msg_len=93,
>>      endian_conversion_needed=0) at totemsrp.c:3301
>> #8  0x08056dc5 in main_deliver_fn (context=0x83c6000, system_from=0x3fbfeb90,
>>      msg=0x83e2650, msg_len=93) at totemsrp.c:3720
>> #9  0x0804df12 in active_mcast_recv (instance=0x83b4700, context=0x83c6000,
>>      system_from=0x3fbfeb90, msg=0x83e2650, msg_len=93) at totemrrp.c:393
>> #10 0x0804e2be in rrp_deliver_fn (context=0x83b5670, system_from=0x3fbfeb90,
>>      msg=0x83e2650, msg_len=93) at totemrrp.c:549
>> #11 0x0804c3b6 in net_deliver_fn (handle=0, fd=6, revents=1, data=0x83e2000,
>>      prio=0x83c1440) at totemnet.c:687
>> #12 0x0804ab76 in poll_run (handle=0) at aispoll.c:424
>> #13 0x0805fdaf in main (argc=1, argv=0x3fbfee88) at main.c:1317
>> (gdb) frame 1
>> #1  0x08060ce6 in compute () at ykd.c:238
>> 238     ykd.c: No such file or directory.
>>          in ykd.c
>> (gdb) print i
>> $1 = 0
>> (gdb) print j
>> $2 = 3375
>> (gdb) print state_received_process[i].ykd_state.ambiguous_sessions_entries
>> $3 = 1212459015
>> (gdb) print  state_received_process[i].ykd_state
>> $4 = {last_primary = {member_list = {{nodeid = 34209795, family =  
>> 1537,
>>          addr = "\002\000\n\002\001\006\000\000\000\000\000\000\000
>> \000\000"}, {
>>          nodeid = 0, family = 0,
>>          addr = "\000\000\000\000\005\000stat\000\000\000\000 
>> \000"}, {
>>          nodeid = 0, family = 0,
>>          addr = '\0' <repeats 15 times>} <repeats 11 times>, {nodeid
>> = 0,
>>          family = 19,
>>          addr = "\000\000\000\000\000\000\000\000\000\000\b\205\000
>> \000F2"}, {
>>          nodeid = 1096298544, family = 12337,
>>          addr = "5910400601\000\001\000\000\000"}, {nodeid =  
>> 1526726789,
>>          family = 19579, addr = "?\r", '\0' <repeats 13 times>},
>> {nodeid = 0,
>>          family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0,
>>          addr = '\0' <repeats 15 times>}, {nodeid = 3904256,  
>> family = 0,
>>          addr = "\000\000\000\000?\001\000\000?\003\002dynam"}, {
>>          nodeid = 1830839145, family = 28005,
>>          addr = "ory", '\0' <repeats 12 times>}, {nodeid = 0, family
>> = 0,
>>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>          addr = '\0' <repeats 12 times>, "?\001\001t"}, {nodeid =
>> 1869639013,
>>          family = 24946, addr = "ry storage\000\000\000\000\000"},
>> {nodeid = 0,
>>          family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0,
>>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> ---Type <return> to continue, or q <return> to quit---
>>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>          addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>          addr = '\0' <repeats 15 times>}, {nodeid = 1929445880,
>> family = 29283,
>>          addr = "ipting engine\000\000"}}, member_list_entries = 0,
>>      session_id = 0}, last_formed = {{member_list = {{nodeid = 0,
>> family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = "\000\000\000\000\000\000\000?\001\000\000\000?\200
>> \000p"}, {
>>            nodeid = 1768387948, family = 110, addr = '\0' <repeats 15
>> times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 1677827264, family = 29793,
>>            addr = "a tracking\000\000\000\000\000"}, {nodeid = 0,
>> family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family =  
>> 504,
>>            addr = "\001filter\000\000\000\000\000\000\000\000"},
>> {nodeid = 0,
>> ---Type <return> to continue, or q <return> to quit---
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = "\000\000?\001\001fragment\000\000"}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0,
>>            addr = "\000@?\001\000\000?\001\001matchin"}, {nodeid =
>> 1852121191,
>>            family = 26983, addr = "ne", '\0' <repeats 13 times>},
>> {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}},
>>        member_list_entries = 0, session_id = 0}, {member_list =
>> {{nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0,
>>            addr = "\000\000?\001\001QoS\000\000\000\000\000\000 
>> \000"}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> ---Type <return> to continue, or q <return> to quit---
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0,
>>            addr = "\000\000\000\000\000\000?\001\001HTML pa"}, {
>>            nodeid = 1919251314, family = 0, addr = '\0' <repeats 15
>> times>}, {
>>            nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>} <repeats 21 times>},
>>        member_list_entries = 0, session_id = 0}, {member_list =
>> {{nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>} <repeats 20
>> times>, {
>>            nodeid = 0, family = 0,
>>            addr = "\000@\003", '\0' <repeats 12 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0,
>>            addr = "\000\000\000\000\000\000\000\200\000\000\000\000
>> \000\000\000"}, {nodeid = 0, family = 0, addr = '\0' <repeats 15
>> times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 17664, addr = "+F", '\0' <repeats 13 times>,  
>> "E"}, {
>>            nodeid = 17963, family = 0, addr = '\0' <repeats 15
>> times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}},
>>        member_list_entries = 0, session_id = 1177240832},
>> {member_list = {{
>> ---Type <return> to continue, or q <return> to quit---
>>            nodeid = 0, family = 0,
>>            addr = "\000\000\000\000\000\000\000E+F\000\000\000\000
>> \000"}, {
>>            nodeid = 0, family = 0,
>>            addr = "\000\000\000\000\000\000\000\000\000/\201\000\000
>> \000\000"},
>>          {nodeid = 7936, family = 0,
>>            addr = "\000\000\000\037\000\000\000\000\000\000\000?\001
>> \000\000"},
>>          {nodeid = 587202560, family = 127, addr = '\0' <repeats 15
>> times>}, {
>>            nodeid = 0, family = 6144,
>>            addr = "\000\000\000\000\000\000\000e\025\t\001\000\000 
>> \000
>> \000\034"}, {nodeid = 250, family = 0,
>>            addr = "\000\211\031\b\001\000\000\000\000\000\000\000\000
>> \000\000"}, {nodeid = 733874688, family = 149,
>>            addr = "\000\000\000?6\000\000]3\t\000??\000\000\035"}, {
>>            nodeid = 1023410639, family = 49259,
>>            addr = "?\r\000\000\000^m+\225\000\000\000\000?Ǹ"},
>> {nodeid = 3498,
>>            family = 25600,
>>            addr = "\207?\224\000\000\000\000\000?M\000\000\000\000
>> \000?"}, {
>>            nodeid = 205, family = 0,
>>            addr = "\000\224?L", '\0' <repeats 11 times>}, {nodeid =
>> 2328321792,
>>            family = 58,
>>            addr = "\000\000\000?\021\000\000??\003\000?;\000\000
>> \231"}, {
>>            nodeid = 1023410313, family = 49259,
>>            addr = "?\r\000\000\000??\212:\000\000\000\000?\0003"}, {
>>            nodeid = 3499, family = 6144,
>> ---Type <return> to continue, or q <return> to quit---
>>            addr = "?z:\000\000\000\000\001\000\000\000\016\000\000
>> \000"}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0,
>>            addr = "\000\000\000=k??\r\000\000\000\000\000\000 
>> \000"}, {
>>            nodeid = 889192448, family = 41377,
>>            addr = "?\r", '\0' <repeats 13 times>}, {nodeid = 0,
>> family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = "\000\000\000=k??\r\000\000\000\000\000\000 
>> \000"}, {
>>            nodeid = 889192448, family = 41377,
>>            addr = "?\r", '\0' <repeats 13 times>}, {nodeid = 0,
>> family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>>            addr = "\000\000\000\000\000\000\000\000\000=k??\r 
>> \000"}, {
>>            nodeid = 0, family = 0,
>>            addr = "\000\000\0005???\r\000\000\000\000\000\000 
>> \000"}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}},
>> ---Type <return> to continue, or q <return> to quit---
>>        member_list_entries = 0, session_id = 0}, {member_list =
>> {{nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0,
>>            addr = "\000=k??\r\000\000\000\000\000\000\000\000 
>> \000"}, {
>>            nodeid = 2711696640, family = 3500, addr = '\0' <repeats
>> 15 times>},
>>          {nodeid = 0, family = 0, addr = '\0' <repeats 15 times>},
>> {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0,
>>            addr = "\000\000\000\000\000\000\000=k??\r\000\000 
>> \000"}, {
>>            nodeid = 0, family = 0,
>>            addr = "\0005???\r\000\000\000\000\000\000\000\000 
>> \000"}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0,
>>            addr = "\000\000\000\000\000\000\000=k??\r\000\000 
>> \000"}, {
>>            nodeid = 0, family = 0,
>>            addr = "\0005???\r\000\000\000\000\000\000\000\000 
>> \000"}, {
>>            nodeid = 0, family = 0,
>>            addr = "\000\000\0002\000\000\000\000\000\000\000\002\000
>> \000\000"},
>>          {nodeid = 33554432, family = 0,
>>            addr = '\0' <repeats 13 times>, "2\000"}, {nodeid = 0,
>> family = 0,
>> ---Type <return> to continue, or q <return> to quit---
>>            addr = '\0' <repeats 15 times>, "\001"}, {nodeid = 0,
>> family = 0,
>>            addr = "\000(=", '\0' <repeats 12 times>}, {nodeid =  
>> 4007936,
>>            family = 0, addr = '\0' <repeats 11 times>, "(?-\000"},
>> {nodeid = 0,
>>            family = 0,
>>            addr = "\000\035=\000\000\000\000\000\000\004\001\000
>> \000=k?"}, {
>>            nodeid = 3500, family = 10240,
>>            addr = "?-\000\000\000\000\0005???\r\000\000\000("}, {
>>            nodeid = 11689, family = 0, addr = '\0' <repeats 15
>> times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 13 times>,
>> "=k?"}, {
>>            nodeid = 3500, family = 0,
>>            addr = "\000\000\000\000\000\000\0005???\r\000\000 
>> \000"}, {
>>            nodeid = 0, family = 0,
>>            addr = "\000\000\000\000\000\001\000\000\000\000\000\000
>> \000\000\000"}, {nodeid = 0, family = 0, addr = '\0' <repeats 15
>> times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>>            family = 0, addr = '\0' <repeats 15 times>}, {nodeid =
>> 1023410176,
>>            family = 49259, addr = "?\r", '\0' <repeats 11 times>,
>> "5??"}},
>>        member_list_entries = 3500, session_id = 0}, {member_list = {{
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>>            nodeid = 0, family = 0, addr = '\0' <repeats 11 times>,
>> "=k??\r"}, {
>> ---Type <return> to continue, or q <return> to quit---q
>> noQuit
>> (gdb)
>>
>> ========================================================================
>> core2 (10.2.1.7,07-04-aisexec-bug1.core):
>>
>>
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "i386-marcel-freebsd"...
>> Attaching to program: /log/07-04-aisexec-bug1, process 7
>> ptrace: Invalid argument.
>>
>> warning: core file may not match specified executable file.
>> Core was generated by `aisexec'.
>> Program terminated with signal 6, Aborted.
>> Reading symbols from /usr/lib/libpthread.so.2...done.
>> Loaded symbols for /usr/lib/libpthread.so.2
>> Reading symbols from /lib/libc.so.6...bdone.
>> Loaded symbols for /lib/libc.so.6
>> Reading symbols from /libexec/ld-elf.so.1...done.
>> Loaded symbols for /libexec/ld-elf.so.1
>> #0  0x28187723 in kill () from /lib/libc.so.6
>> [New LWP 100102]
>> (gdb) bt
>> #0  0x28187723 in kill () from /lib/libc.so.6
>> #1  0x280b61da in raise () from /usr/lib/libpthread.so.2
>> #2  0x281863d4 in abort () from /lib/libc.so.6
>> #3  0x28164358 in __assert () from /lib/libc.so.6
>> #4  0x08050aa2 in sq_item_get (sq=0x83c77c0, seq_id=256,
>>      sq_item_out=0x3fbfde10) at sq.h:254
>> #5  0x08052558 in update_aru (instance=0x83c6000) at totemsrp.c:1964
>> #6  0x080527de in orf_token_mcast (instance=0x83c6000, token=0x3fbfe4c0,
>>      fcc_mcasts_allowed=0, system_from=0x3fbfeb90) at totemsrp.c:2061
>> #7  0x080543f0 in message_handler_orf_token (instance=0x83c6000,
>>      system_from=0x3fbfeb90, msg=0x83e2650, msg_len=88,
>>      endian_conversion_needed=0) at totemsrp.c:2920
>> #8  0x08056dc5 in main_deliver_fn (context=0x83c6000, system_from=0x3fbfeb90,
>>      msg=0x83e2650, msg_len=88) at totemsrp.c:3720
>> #9  0x0804e193 in active_token_recv (instance=0x83b4700, interface_no=0,
>>      context=0x83c6000, system_from=0x3fbfeb90, msg=0x83e2650, msg_len=88,
>>      token_seqid=272) at totemrrp.c:477
>> #10 0x0804e296 in rrp_deliver_fn (context=0x83b5670, system_from=0x3fbfeb90,
>>      msg=0x83e2650, msg_len=88) at totemrrp.c:537
>> #11 0x0804c3b6 in net_deliver_fn (handle=0, fd=8, revents=1, data=0x83e2000,
>>      prio=0x83c1454) at totemnet.c:687
>> #12 0x0804ab76 in poll_run (handle=0) at aispoll.c:424
>> #13 0x0805fdaf in main (argc=1, argv=0x3fbfee88) at main.c:1317
>> (gdb) frame 4
>> #4  0x08050aa2 in sq_item_get (sq=0x83c77c0, seq_id=256,
>>      sq_item_out=0x3fbfde10) at sq.h:254
>> 254     sq.h: No such file or directory.
>>          in sq.h
>> (gdb) print *sq
>> $1 = {head = 0, size = 256, items = 0x83d7000, items_inuse = 0x83c4000,
>>    size_per_item = 44, head_seqid = 0, item_count = 256, pos_max = 255}
>> (gdb) frame 5
>> #5  0x08052558 in update_aru (instance=0x83c6000) at totemsrp.c:1964
>> 1964    totemsrp.c: No such file or directory.
>>          in totemsrp.c
>> (gdb) print *instance
>> $2 = {first_run = 1, fcc_remcast_last = 0, fcc_mcast_last = 0,
>>    fcc_mcast_current = 0, fcc_remcast_current = 0, consensus_list =
>> {{addr = {
>>          nodeid = 117506570, family = 2,
>>          addr = "\n\002\001\a??;?\005\bL1;\b`?"}, set = 1}, {addr = {
>>          nodeid = 100729354, family = 2,
>>          addr = "\n\002\001\006??;?\005\bL1;\b\000?"}, set = 1},
>> {addr = {
>>          nodeid = 4262724106, family = 2,
>>          addr = "\n\002\024???;?\005\bL1;\b\000?"}, set = 1},  
>> {addr = {
>>          nodeid = 84607498, family = 2,
>>          addr = "\n\002\v\005??;?\005\bL1;\b\000?"}, set = 1},  
>> {addr = {
>>          nodeid = 0, family = 0, addr = '\0' <repeats 15 times>},
>>        set = 0} <repeats 28 times>}, consensus_list_entries = 4,
>>    my_proc_list = {{nodeid = 100729354, family = 2,
>>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =
>> 117506570,
>>        family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>>        nodeid = 4262724106, family = 2,
>>        addr = "\n\002\024?", '\0' <repeats 11 times>}, {nodeid =
>> 84607498,
>>        family = 2, addr = "\n\002\v\005", '\0' <repeats 11 times>}, {
>>        nodeid = 0, family = 0,
>>        addr = '\0' <repeats 15 times>} <repeats 28 times>},
>> my_failed_list = {{
>>        nodeid = 4262724106, family = 2, addr = "\n\002\024???;?\005
>> \bL1;\b??"},
>>      {nodeid = 84607498, family = 2,
>>        addr = "\n\002\v\005", '\0' <repeats 11 times>}, {nodeid = 0,
>>        family = 0, addr = '\0' <repeats 15 times>} <repeats 30  
>> times>},
>> ---Type <return> to continue, or q <return> to quit---
>>    my_new_memb_list = {{nodeid = 100729354, family = 2,
>>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =
>> 117506570,
>>        family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>>        nodeid = 84607498, family = 2,
>>        addr = "\n\002\v\005", '\0' <repeats 11 times>}, {nodeid =
>> 4262724106,
>>        family = 2, addr = "\n\002\024?", '\0' <repeats 11 times>},
>> {nodeid = 0,
>>        family = 0, addr = '\0' <repeats 15 times>} <repeats 28  
>> times>},
>>    my_trans_memb_list = {{nodeid = 100729354, family = 2,
>>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =
>> 117506570,
>>        family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>>        nodeid = 4262724106, family = 2,
>>        addr = "\n\002\024?", '\0' <repeats 11 times>}, {nodeid = 0,
>> family = 0,
>>        addr = '\0' <repeats 15 times>} <repeats 29 times>},
>> my_memb_list = {{
>>        nodeid = 100729354, family = 2,
>>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =
>> 117506570,
>>        family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>>        nodeid = 4262724106, family = 2,
>>        addr = "\n\002\024?", '\0' <repeats 11 times>}, {nodeid =
>> 4262724106,
>>        family = 2, addr = "\n\002\024?", '\0' <repeats 11 times>},
>> {nodeid = 0,
>>        family = 0, addr = '\0' <repeats 15 times>} <repeats 28  
>> times>},
>>    my_deliver_memb_list = {{nodeid = 100729354, family = 2,
>>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =
>> 117506570,
>>        family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>>        nodeid = 4262724106, family = 2,
>> ---Type <return> to continue, or q <return> to quit---
>>        addr = "\n\002\024?", '\0' <repeats 11 times>}, {nodeid = 0,
>> family = 0,
>>        addr = '\0' <repeats 15 times>} <repeats 29 times>},
>>    my_nodeid_lookup_list = {{nodeid = 117506570, family = 2,
>>        addr = "\n\002\001\a", '\0' <repeats 11 times>}, {nodeid =
>> 100729354,
>>        family = 2, addr = "\n\002\001\006", '\0' <repeats 11  
>> times>}, {
>>        nodeid = 84607498, family = 2,
>>        addr = "\n\002\v\005", '\0' <repeats 11 times>}, {nodeid =
>> 4262724106,
>>        family = 2, addr = "\n\002\024?", '\0' <repeats 11 times>},
>> {nodeid = 0,
>>        family = 0, addr = '\0' <repeats 15 times>} <repeats 28  
>> times>},
>>    my_proc_list_entries = 4, my_failed_list_entries = 0,
>>    my_new_memb_entries = 4, my_trans_memb_entries = 3,
>> my_memb_entries = 3,
>>    my_deliver_memb_entries = 3, my_nodeid_lookup_entries = 4,
>> my_ring_id = {
>>      rep = {nodeid = 100729354, family = 2,
>>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, seq =  
>> 87120},
>>    my_old_ring_id = {rep = {nodeid = 100729354, family = 2,
>>        addr = "\n\002\001\006", '\0' <repeats 11 times>}, seq =  
>> 87116},
>>    my_aru_count = 0, my_merge_detect_timeout_outstanding = 0,
>>    my_last_aru = 272, my_seq_unchanged = 0, my_received_flg = 0,
>>    my_high_seq_received = 272, my_install_seq = 0,
>> my_rotation_counter = 0,
>>    my_set_retrans_flg = 1, my_retrans_flg_count = 0,
>>    my_high_ring_delivered = 0, heartbeat_timeout = 764,
>> new_message_queue = {
>>      head = 108, tail = 50, used = 57, usedhw = 57, size = 195,
>>      items = 0x83e7000, size_per_item = 48, iterator = 0},
>>    retrans_message_queue = {head = 0, tail = 499, used = 0, usedhw  
>> = 0,
>> ---Type <return> to continue, or q <return> to quit---
>>      size = 500, items = 0x83ce000, size_per_item = 48, iterator =  
>> 0},
>>    regular_sort_queue = {head = 0, size = 256, items = 0x83d4000,
>>      items_inuse = 0x83c0c00, size_per_item = 44, head_seqid = 0,
>>      item_count = 256, pos_max = 0}, recovery_sort_queue = {head = 0,
>>      size = 256, items = 0x83d7000, items_inuse = 0x83c4000,
>>      size_per_item = 44, head_seqid = 0, item_count = 256, pos_max =
>> 255},
>>    my_aru = 255, my_high_delivered = 0,
>> token_callback_received_listhead = {
>>      next = 0x83b3440, prev = 0x83b3440},
>> token_callback_sent_listhead = {
>>      next = 0x83c77f0, prev = 0x83c77f0}, orf_token_retransmit =
>> 0x83ca000 "",
>>    orf_token_retransmit_size = 88, my_token_seq = 64,
>>    timer_orf_token_timeout = 0x856d660,
>>    timer_orf_token_retransmit_timeout = 0x85f0d20,
>>    timer_orf_token_hold_retransmit_timeout = 0x0,
>>    timer_merge_detect_timeout = 0x0,
>>    memb_timer_state_gather_join_timeout = 0x0,
>>    memb_timer_state_gather_consensus_timeout = 0x0,
>>    memb_timer_state_commit_timeout = 0x0, timer_heartbeat_timeout =
>> 0x83b33e0,
>>    totemsrp_log_level_security = 65538, totemsrp_log_level_error =
>> 131074,
>>    totemsrp_log_level_warning = 196610, totemsrp_log_level_notice =
>> 262146,
>>    totemsrp_log_level_debug = 327682,
>>    totemsrp_log_printf = 0x805ff60 <internal_log_printf>,
>>    memb_state = MEMB_STATE_RECOVERY, my_id = {nodeid = 117506570,
>> family = 2,
>>      addr = "\n\002\001\a", '\0' <repeats 11 times>}, next_memb = {
>>      nodeid = 84607498, family = 2,
>> ---Type <return> to continue, or q <return> to quit---
>>      addr = "\n\002\v\005", '\0' <repeats 11 times>},
>>    iov_buffer = '\0' <repeats 8999 times>, totemsrp_iov_recv =
>> {iov_base = 0x0,
>>      iov_len = 0}, totemsrp_poll_handle = 0, totemsrp_recv = 0,
>>    mcast_address = {nodeid = 0, family = 2,
>>      addr = "?^\001\003", '\0' <repeats 11 times>},
>>    totemsrp_deliver_fn = 0x8056edc <totemmrp_deliver_fn>,
>>    totemsrp_confchg_fn = 0x8056f10 <totemmrp_confchg_fn>,
>> global_seqno = 246,
>>    my_token_held = 0, token_ring_id_seq = 87120, last_released = 0,
>>    set_aru = 4294967295, old_ring_state_saved = 1, old_ring_state_aru
>> = 0,
>>    old_ring_state_high_seq_received = 0, ring_saved = 1, my_last_seq
>> = 272,
>>    tv_old = {tv_sec = 0, tv_usec = 0}, totemrrp_handle = 0,
>>    totem_config = 0x3fbfed14, use_heartbeat = 1, my_trc = 0,  
>> my_pbl = 0}
>> (gdb)
>> (gdb)
>> (gdb) print range
>> $3 = 17
>>
>>
>> ========================================================================
>> core3 (10.2.1.7,07-04-aisexec-bug2.core):
>>
>> F200XA105910400601>gdb 07-04-aisexec-bug2 07-04-aisexec-bug2.core
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "i386-marcel-freebsd"...
>> Attaching to program: /log/07-04-aisexec-bug2, process 7
>> ptrace: Invalid argument.
>>
>> warning: core file may not match specified executable file.
>> Core was generated by `aisexec'.
>> Program terminated with signal 11, Segmentation fault.
>> Reading symbols from /usr/lib/libpthread.so.2...done.
>> Loaded symbols for /usr/lib/libpthread.so.2
>> Reading symbols from /lib/libc.so.6...done.
>> Loaded symbols for /lib/libc.so.6
>> Reading symbols from /libexec/ld-elf.so.1...done.
>> Loaded symbols for /libexec/ld-elf.so.1
>> #0  0x280f8e21 in memcmp () from /lib/libc.so.6
>> [New LWP 100093]
>> (gdb) bt
>> #0  0x280f8e21 in memcmp () from /lib/libc.so.6
>> #1  0x08057ce7 in group_matches (iovec=0x808b148, iov_len=1,
>>      groups_b=0x83b56d0, group_b_cnt=1, adjust_iovec=0x3fbfc5bc)
>>      at totempg.c:308
>> #2  0x08057b86 in app_deliver_fn (source_addr=0x3fbfe9b0, iovec=0x808b148,
>>      iov_len=1, endian_conversion_required=0) at totempg.c:340
>> #3  0x080579dd in totempg_deliver_fn (source_addr=0x3fbfe9b0, iovec=0x83d5760,
>>      iov_len=1, endian_conversion_required=0) at totempg.c:539
>> #4  0x08056f05 in totemmrp_deliver_fn (source_addr=0x3fbfe9b0,
>>      iovec=0x83d5760, iov_len=1, endian_conversion_required=0) at totemmrp.c:81
>> #5  0x08054cc3 in messages_deliver_to_app (instance=0x83c6000, skip=0,
>>      end_point=136) at totemsrp.c:3164
>> #6  0x08055065 in message_handler_mcast (instance=0x83c6000,
>>      system_from=0x3fbfeb90, msg=0x83e2650, msg_len=1336,
>>      endian_conversion_needed=0) at totemsrp.c:3301
>> #7  0x08056dc5 in main_deliver_fn (context=0x83c6000, system_from=0x3fbfeb90,
>>      msg=0x83e2650, msg_len=1336) at totemsrp.c:3720
>> #8  0x0804df12 in active_mcast_recv (instance=0x83b4700, context=0x83c6000,
>>      system_from=0x3fbfeb90, msg=0x83e2650, msg_len=1336) at totemrrp.c:393
>> #9  0x0804e2be in rrp_deliver_fn (context=0x83b5670, system_from=0x3fbfeb90,
>>      msg=0x83e2650, msg_len=1336) at totemrrp.c:549
>> #10 0x0804c3b6 in net_deliver_fn (handle=0, fd=6, revents=1, data=0x83e2000,
>>      prio=0x83b4880) at totemnet.c:687
>> #11 0x0804ab76 in poll_run (handle=0) at aispoll.c:424
>> ---Type <return> to continue, or q <return> to quit---frame 1
>> #12 0x0805fdaf in main (argc=1, argv=0x3fbfee88) at main.c:1317
>> (gdb) print *iovec
>> No symbol "iovec" in current context.
>> (gdb) frame 1
>> #1  0x08057ce7 in group_matches (iovec=0x808b148, iov_len=1,
>>      groups_b=0x83b56d0, group_b_cnt=1, adjust_iovec=0x3fbfc5bc)
>>      at totempg.c:308
>> 308     totempg.c: No such file or directory.
>>          in totempg.c
>> (gdb) print *iovec
>> $1 = {iov_base = 0x8525016, iov_len = 72033}
>> (gdb) print *groups_b
>> $2 = {group = 0x807e617, group_len = 1}
>> (gdb)
>> $3 = {group = 0x807e617, group_len = 1}
>> (gdb) print i
>> $4 = 7833
>> (gdb) print j
>> $5 = 0
>> (gdb) print group_len[0]
>> $6 = 41035
>> (gdb)
>>
>> <core-1.debug.patch>




