[Openais] Re: new batch of bugs - was defect 1170 - assert in memb_state_recover_enter
Fabien THOMAS
fabien.thomas at netasq.com
Mon Apr 24 05:11:13 PDT 2006
I'm back from holiday :)
> I've been swamped lately but have had a chance to spend a day studying
> these problems.
>
> When you run in debug mode could you run with "secauth" turned to on?
> This will verify that messages are being properly received over the
> network.
>
ok
> For core 1 I'm not sure how the max sessions is being set to such a
> large value. This is the cause of the segfault.
>
> state_received_process[i].ykd_state.ambiguous_sessions_entries
> $3 = 1212459015
>
> Could you try the attached core-1-debug.patch debug patch against trunk
> in your environment? We are looking for ambiguous sessions that look
> large. Logs of the output of stdout are helpful in this case.
>
OK, I will update to the latest trunk and rerun my stress test with the patch.
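
For illustration only, the kind of sanity check relevant to core 1 can be sketched as below. This is not the code from ykd.c or from the debug patch: `MAX_AMBIGUOUS_SESSIONS`, `struct session_table` and `copy_sessions_checked()` are hypothetical names, and the limit of 8 is an assumption. The only point is that an entry count taken from a received message should be validated before it is used to size a memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit and layout, for illustration only (not from ykd.c). */
#define MAX_AMBIGUOUS_SESSIONS 8

struct session_table {
    int entries;                           /* count claimed by the sender */
    int sessions[MAX_AMBIGUOUS_SESSIONS];  /* fixed-size storage */
};

/*
 * Copy src into dst only if the claimed entry count fits the table.
 * Returns 0 on success, -1 if the count is out of range.
 */
static int copy_sessions_checked(struct session_table *dst,
                                 const struct session_table *src)
{
    assert(dst != NULL && src != NULL);

    if (src->entries < 0 || src->entries > MAX_AMBIGUOUS_SESSIONS) {
        return -1;  /* a count such as 1212459015 is rejected here */
    }
    memcpy(dst->sessions, src->sessions,
           (size_t)src->entries * sizeof(src->sessions[0]));
    dst->entries = src->entries;
    return 0;
}
```

With a check of this shape, a corrupted count would be reported instead of being passed to memcpy as a length.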
> I have duplicated core 2 and understand the problem. I will work out a
> patch this week.
>
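
For reference, the core 2 dump shows sq_item_get() called with seq_id=256 on a queue with head_seqid = 0 and size = 256. A minimal sketch of a window test that such a request would fail is given below. It assumes, without confirmation from the source, that the assert at sq.h:254 is a range check of this general form; `struct sq_view` and `sq_in_window()` are made-up names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down view of the fields shown by "print *sq". */
struct sq_view {
    unsigned int head_seqid;  /* sequence id stored at the head slot */
    unsigned int size;        /* number of slots in the queue */
};

/* Returns 1 if seq_id lies in [head_seqid, head_seqid + size), else 0. */
static int sq_in_window(const struct sq_view *sq, unsigned int seq_id)
{
    assert(sq != NULL);
    /* Unsigned subtraction keeps the test valid if seq ids wrap around. */
    return seq_id >= sq->head_seqid &&
           seq_id - sq->head_seqid < sq->size;
}
```

With head_seqid = 0 and size = 256, sequence id 255 is inside the window and 256 is the first id outside it.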
> Are you seeing a lot of reconfigurations during normal running
> operation?
No, but I am forcing reconfigurations by killing aisexec at random
times on each node.
> Are you in a cross endian environment?
>
no
> For core 3 I believe this problem is fixed by the strict aliasing
> patches that went into the tree. I ran into this same problem on fc 5
> with gcc 4.1. Which version of gcc are you using?
$gcc -v
Using built-in specs.
Configured with: FreeBSD/i386 system compiler
Thread model: posix
gcc version 3.4.4 [FreeBSD] 20050518
I'm also using a DEBUG build (-O0) for better debugging support.
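
As background on the strict aliasing question: the pattern such patches generally remove is reading a byte buffer through a pointer of a different type. A minimal sketch of the usual safe alternative, copying through memcpy, follows. `read_u32()` is a hypothetical helper, not a function from the openais tree.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit value from an arbitrary byte offset in a message buffer.
 * Casting (buf + off) to uint32_t * and dereferencing it can violate the
 * strict aliasing rules and may also be misaligned; memcpy is well defined.
 */
static uint32_t read_u32(const unsigned char *buf, size_t off)
{
    uint32_t v;

    assert(buf != NULL);
    memcpy(&v, buf + off, sizeof(v));
    return v;
}
```

The value is returned in host byte order, so any endian conversion would still have to be applied by the caller.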
>
> Regards
> -steve
>
> For core 2 below On Fri, 2006-04-07 at 10:17 +0200, Fabien THOMAS
> wrote:
>> I've set the MTU to 1400 and it seems to solve the problem I had
>> before, but it still doesn't work: I have 3 different cores now :(
>>
>> I have 4 nodes: aisexec is killed at random on 3 of them, and the
>> fourth randomly accesses checkpoint data.
>> In less than 3 minutes, 2 of the 4 nodes crashed at least 3 times.
>>
>> The information for the 3 cores follows:
>>
>> core1 (10.2.20.254,07-04-aisexec-bug3.core):
>>
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and
>> you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for
>> details.
>> This GDB was configured as "i386-marcel-freebsd"...
>> Core was generated by `aisexec'.
>> Program terminated with signal 11, Segmentation fault.
>> Reading symbols from /usr/lib/libpthread.so.2...done.
>> Loaded symbols for /usr/lib/libpthread.so.2
>> Reading symbols from /lib/libc.so.6...done.
>> Loaded symbols for /lib/libc.so.6
>> Reading symbols from /libexec/ld-elf.so.1...done.
>> Loaded symbols for /libexec/ld-elf.so.1
>> #0 0x281893e6 in memcpy () from /lib/libc.so.6
>> [New LWP 100061]
>> (gdb) bt
>> #0 0x281893e6 in memcpy () from /lib/libc.so.6
>> #1 0x08060ce6 in compute () at ykd.c:238
>> #2 0x080612b5 in ykd_deliver_fn (source_addr=0x3fbfe9b0,
>> iovec=0x3fbfc5c0,
>> iov_len=1, endian_conversion_required=0) at ykd.c:401
>> #3 0x08057bba in app_deliver_fn (source_addr=0x3fbfe9b0,
>> iovec=0x808b148,
>> iov_len=1, endian_conversion_required=0) at totempg.c:343
>> #4 0x080579dd in totempg_deliver_fn (source_addr=0x3fbfe9b0,
>> iovec=0x83d5658,
>> iov_len=1, endian_conversion_required=0) at totempg.c:539
>> #5 0x08056f05 in totemmrp_deliver_fn (source_addr=0x3fbfe9b0,
>> iovec=0x83d5658, iov_len=1, endian_conversion_required=0) at
>> totemmrp.c:81
>> #6 0x08054cc3 in messages_deliver_to_app (instance=0x83c6000,
>> skip=0,
>> end_point=386) at totemsrp.c:3164
>> #7 0x08055065 in message_handler_mcast (instance=0x83c6000,
>> system_from=0x3fbfeb90, msg=0x83e2650, msg_len=93,
>> endian_conversion_needed=0) at totemsrp.c:3301
>> #8 0x08056dc5 in main_deliver_fn (context=0x83c6000,
>> system_from=0x3fbfeb90,
>> msg=0x83e2650, msg_len=93) at totemsrp.c:3720
>> #9 0x0804df12 in active_mcast_recv (instance=0x83b4700,
>> context=0x83c6000,
>> system_from=0x3fbfeb90, msg=0x83e2650, msg_len=93) at
>> totemrrp.c:393
>> #10 0x0804e2be in rrp_deliver_fn (context=0x83b5670,
>> system_from=0x3fbfeb90,
>> msg=0x83e2650, msg_len=93) at totemrrp.c:549
>> #11 0x0804c3b6 in net_deliver_fn (handle=0, fd=6, revents=1,
>> data=0x83e2000,
>> prio=0x83c1440) at totemnet.c:687
>> #12 0x0804ab76 in poll_run (handle=0) at aispoll.c:424
>> #13 0x0805fdaf in main (argc=1, argv=0x3fbfee88) at main.c:1317
>> (gdb) frame 1
>> #1 0x08060ce6 in compute () at ykd.c:238
>> 238 ykd.c: No such file or directory.
>> in ykd.c
>> (gdb) print i
>> $1 = 0
>> (gdb) print j
>> $2 = 3375
>> (gdb) print state_received_process[i].ykd_state.ambiguous_sessions_entries
>> $3 = 1212459015
>> (gdb) print state_received_process[i].ykd_state
>> $4 = {last_primary = {member_list = {{nodeid = 34209795, family =
>> 1537,
>> addr = "\002\000\n\002\001\006\000\000\000\000\000\000\000
>> \000\000"}, {
>> nodeid = 0, family = 0,
>> addr = "\000\000\000\000\005\000stat\000\000\000\000
>> \000"}, {
>> nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>} <repeats 11 times>, {nodeid
>> = 0,
>> family = 19,
>> addr = "\000\000\000\000\000\000\000\000\000\000\b\205\000
>> \000F2"}, {
>> nodeid = 1096298544, family = 12337,
>> addr = "5910400601\000\001\000\000\000"}, {nodeid =
>> 1526726789,
>> family = 19579, addr = "?\r", '\0' <repeats 13 times>},
>> {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 3904256,
>> family = 0,
>> addr = "\000\000\000\000?\001\000\000?\003\002dynam"}, {
>> nodeid = 1830839145, family = 28005,
>> addr = "ory", '\0' <repeats 12 times>}, {nodeid = 0, family
>> = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 12 times>, "?\001\001t"}, {nodeid =
>> 1869639013,
>> family = 24946, addr = "ry storage\000\000\000\000\000"},
>> {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 1929445880,
>> family = 29283,
>> addr = "ipting engine\000\000"}}, member_list_entries = 0,
>> session_id = 0}, last_formed = {{member_list = {{nodeid = 0,
>> family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = "\000\000\000\000\000\000\000?\001\000\000\000?\200
>> \000p"}, {
>> nodeid = 1768387948, family = 110, addr = '\0' <repeats 15
>> times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 1677827264, family = 29793,
>> addr = "a tracking\000\000\000\000\000"}, {nodeid = 0,
>> family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family =
>> 504,
>> addr = "\001filter\000\000\000\000\000\000\000\000"},
>> {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = "\000\000?\001\001fragment\000\000"}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0,
>> addr = "\000@?\001\000\000?\001\001matchin"}, {nodeid =
>> 1852121191,
>> family = 26983, addr = "ne", '\0' <repeats 13 times>},
>> {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}},
>> member_list_entries = 0, session_id = 0}, {member_list =
>> {{nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0,
>> addr = "\000\000?\001\001QoS\000\000\000\000\000\000
>> \000"}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0,
>> addr = "\000\000\000\000\000\000?\001\001HTML pa"}, {
>> nodeid = 1919251314, family = 0, addr = '\0' <repeats 15
>> times>}, {
>> nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>} <repeats 21 times>},
>> member_list_entries = 0, session_id = 0}, {member_list =
>> {{nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>} <repeats 20
>> times>, {
>> nodeid = 0, family = 0,
>> addr = "\000@\003", '\0' <repeats 12 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0,
>> addr = "\000\000\000\000\000\000\000\200\000\000\000\000
>> \000\000\000"}, {nodeid = 0, family = 0, addr = '\0' <repeats 15
>> times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 17664, addr = "+F", '\0' <repeats 13 times>,
>> "E"}, {
>> nodeid = 17963, family = 0, addr = '\0' <repeats 15
>> times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}},
>> member_list_entries = 0, session_id = 1177240832},
>> {member_list = {{
>> nodeid = 0, family = 0,
>> addr = "\000\000\000\000\000\000\000E+F\000\000\000\000
>> \000"}, {
>> nodeid = 0, family = 0,
>> addr = "\000\000\000\000\000\000\000\000\000/\201\000\000
>> \000\000"},
>> {nodeid = 7936, family = 0,
>> addr = "\000\000\000\037\000\000\000\000\000\000\000?\001
>> \000\000"},
>> {nodeid = 587202560, family = 127, addr = '\0' <repeats 15
>> times>}, {
>> nodeid = 0, family = 6144,
>> addr = "\000\000\000\000\000\000\000e\025\t\001\000\000
>> \000
>> \000\034"}, {nodeid = 250, family = 0,
>> addr = "\000\211\031\b\001\000\000\000\000\000\000\000\000
>> \000\000"}, {nodeid = 733874688, family = 149,
>> addr = "\000\000\000?6\000\000]3\t\000??\000\000\035"}, {
>> nodeid = 1023410639, family = 49259,
>> addr = "?\r\000\000\000^m+\225\000\000\000\000?Ǹ"},
>> {nodeid = 3498,
>> family = 25600,
>> addr = "\207?\224\000\000\000\000\000?M\000\000\000\000
>> \000?"}, {
>> nodeid = 205, family = 0,
>> addr = "\000\224?L", '\0' <repeats 11 times>}, {nodeid =
>> 2328321792,
>> family = 58,
>> addr = "\000\000\000?\021\000\000??\003\000?;\000\000
>> \231"}, {
>> nodeid = 1023410313, family = 49259,
>> addr = "?\r\000\000\000??\212:\000\000\000\000?\0003"}, {
>> nodeid = 3499, family = 6144,
>> addr = "?z:\000\000\000\000\001\000\000\000\016\000\000
>> \000"}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0,
>> addr = "\000\000\000=k??\r\000\000\000\000\000\000
>> \000"}, {
>> nodeid = 889192448, family = 41377,
>> addr = "?\r", '\0' <repeats 13 times>}, {nodeid = 0,
>> family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = "\000\000\000=k??\r\000\000\000\000\000\000
>> \000"}, {
>> nodeid = 889192448, family = 41377,
>> addr = "?\r", '\0' <repeats 13 times>}, {nodeid = 0,
>> family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, {nodeid = 0, family = 0,
>> addr = "\000\000\000\000\000\000\000\000\000=k??\r
>> \000"}, {
>> nodeid = 0, family = 0,
>> addr = "\000\000\0005???\r\000\000\000\000\000\000
>> \000"}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}},
>> member_list_entries = 0, session_id = 0}, {member_list =
>> {{nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0,
>> addr = "\000=k??\r\000\000\000\000\000\000\000\000
>> \000"}, {
>> nodeid = 2711696640, family = 3500, addr = '\0' <repeats
>> 15 times>},
>> {nodeid = 0, family = 0, addr = '\0' <repeats 15 times>},
>> {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0,
>> addr = "\000\000\000\000\000\000\000=k??\r\000\000
>> \000"}, {
>> nodeid = 0, family = 0,
>> addr = "\0005???\r\000\000\000\000\000\000\000\000
>> \000"}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0,
>> addr = "\000\000\000\000\000\000\000=k??\r\000\000
>> \000"}, {
>> nodeid = 0, family = 0,
>> addr = "\0005???\r\000\000\000\000\000\000\000\000
>> \000"}, {
>> nodeid = 0, family = 0,
>> addr = "\000\000\0002\000\000\000\000\000\000\000\002\000
>> \000\000"},
>> {nodeid = 33554432, family = 0,
>> addr = '\0' <repeats 13 times>, "2\000"}, {nodeid = 0,
>> family = 0,
>> addr = '\0' <repeats 15 times>, "\001"}, {nodeid = 0,
>> family = 0,
>> addr = "\000(=", '\0' <repeats 12 times>}, {nodeid =
>> 4007936,
>> family = 0, addr = '\0' <repeats 11 times>, "(?-\000"},
>> {nodeid = 0,
>> family = 0,
>> addr = "\000\035=\000\000\000\000\000\000\004\001\000
>> \000=k?"}, {
>> nodeid = 3500, family = 10240,
>> addr = "?-\000\000\000\000\0005???\r\000\000\000("}, {
>> nodeid = 11689, family = 0, addr = '\0' <repeats 15
>> times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 13 times>,
>> "=k?"}, {
>> nodeid = 3500, family = 0,
>> addr = "\000\000\000\000\000\000\0005???\r\000\000
>> \000"}, {
>> nodeid = 0, family = 0,
>> addr = "\000\000\000\000\000\001\000\000\000\000\000\000
>> \000\000\000"}, {nodeid = 0, family = 0, addr = '\0' <repeats 15
>> times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>}, {nodeid =
>> 1023410176,
>> family = 49259, addr = "?\r", '\0' <repeats 11 times>,
>> "5??"}},
>> member_list_entries = 3500, session_id = 0}, {member_list = {{
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 11 times>,
>> "=k??\r"}, {
>> ---Type <return> to continue, or q <return> to quit---q
>> noQuit
>> (gdb)
>>
>> ========================================================================
>> core2 (10.2.1.7,07-04-aisexec-bug1.core):
>>
>>
>> GNU gdb 6.1.1 [FreeBSD]
>> This GDB was configured as "i386-marcel-freebsd"...
>> Attaching to program: /log/07-04-aisexec-bug1, process 7
>> ptrace: Invalid argument.
>>
>> warning: core file may not match specified executable file.
>> Core was generated by `aisexec'.
>> Program terminated with signal 6, Aborted.
>> Reading symbols from /usr/lib/libpthread.so.2...done.
>> Loaded symbols for /usr/lib/libpthread.so.2
>> Reading symbols from /lib/libc.so.6...done.
>> Loaded symbols for /lib/libc.so.6
>> Reading symbols from /libexec/ld-elf.so.1...done.
>> Loaded symbols for /libexec/ld-elf.so.1
>> #0 0x28187723 in kill () from /lib/libc.so.6
>> [New LWP 100102]
>> (gdb) bt
>> #0 0x28187723 in kill () from /lib/libc.so.6
>> #1 0x280b61da in raise () from /usr/lib/libpthread.so.2
>> #2 0x281863d4 in abort () from /lib/libc.so.6
>> #3 0x28164358 in __assert () from /lib/libc.so.6
>> #4 0x08050aa2 in sq_item_get (sq=0x83c77c0, seq_id=256,
>> sq_item_out=0x3fbfde10) at sq.h:254
>> #5 0x08052558 in update_aru (instance=0x83c6000) at totemsrp.c:1964
>> #6 0x080527de in orf_token_mcast (instance=0x83c6000,
>> token=0x3fbfe4c0,
>> fcc_mcasts_allowed=0, system_from=0x3fbfeb90) at totemsrp.c:2061
>> #7 0x080543f0 in message_handler_orf_token (instance=0x83c6000,
>> system_from=0x3fbfeb90, msg=0x83e2650, msg_len=88,
>> endian_conversion_needed=0) at totemsrp.c:2920
>> #8 0x08056dc5 in main_deliver_fn (context=0x83c6000,
>> system_from=0x3fbfeb90,
>> msg=0x83e2650, msg_len=88) at totemsrp.c:3720
>> #9 0x0804e193 in active_token_recv (instance=0x83b4700,
>> interface_no=0,
>> context=0x83c6000, system_from=0x3fbfeb90, msg=0x83e2650,
>> msg_len=88,
>> token_seqid=272) at totemrrp.c:477
>> #10 0x0804e296 in rrp_deliver_fn (context=0x83b5670,
>> system_from=0x3fbfeb90,
>> msg=0x83e2650, msg_len=88) at totemrrp.c:537
>> #11 0x0804c3b6 in net_deliver_fn (handle=0, fd=8, revents=1,
>> data=0x83e2000,
>> prio=0x83c1454) at totemnet.c:687
>> #12 0x0804ab76 in poll_run (handle=0) at aispoll.c:424
>> #13 0x0805fdaf in main (argc=1, argv=0x3fbfee88) at main.c:1317
>> (gdb) frame 4
>> #4 0x08050aa2 in sq_item_get (sq=0x83c77c0, seq_id=256,
>> sq_item_out=0x3fbfde10) at sq.h:254
>> 254 sq.h: No such file or directory.
>> in sq.h
>> (gdb) print *sq
>> $1 = {head = 0, size = 256, items = 0x83d7000, items_inuse =
>> 0x83c4000,
>> size_per_item = 44, head_seqid = 0, item_count = 256, pos_max =
>> 255}
>> (gdb) frame 5
>> #5 0x08052558 in update_aru (instance=0x83c6000) at totemsrp.c:1964
>> 1964 totemsrp.c: No such file or directory.
>> in totemsrp.c
>> (gdb) print *instance
>> $2 = {first_run = 1, fcc_remcast_last = 0, fcc_mcast_last = 0,
>> fcc_mcast_current = 0, fcc_remcast_current = 0, consensus_list =
>> {{addr = {
>> nodeid = 117506570, family = 2,
>> addr = "\n\002\001\a??;?\005\bL1;\b`?"}, set = 1}, {addr = {
>> nodeid = 100729354, family = 2,
>> addr = "\n\002\001\006??;?\005\bL1;\b\000?"}, set = 1},
>> {addr = {
>> nodeid = 4262724106, family = 2,
>> addr = "\n\002\024???;?\005\bL1;\b\000?"}, set = 1},
>> {addr = {
>> nodeid = 84607498, family = 2,
>> addr = "\n\002\v\005??;?\005\bL1;\b\000?"}, set = 1},
>> {addr = {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>},
>> set = 0} <repeats 28 times>}, consensus_list_entries = 4,
>> my_proc_list = {{nodeid = 100729354, family = 2,
>> addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =
>> 117506570,
>> family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>> nodeid = 4262724106, family = 2,
>> addr = "\n\002\024?", '\0' <repeats 11 times>}, {nodeid =
>> 84607498,
>> family = 2, addr = "\n\002\v\005", '\0' <repeats 11 times>}, {
>> nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>} <repeats 28 times>},
>> my_failed_list = {{
>> nodeid = 4262724106, family = 2, addr = "\n\002\024???;?\005
>> \bL1;\b??"},
>> {nodeid = 84607498, family = 2,
>> addr = "\n\002\v\005", '\0' <repeats 11 times>}, {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>} <repeats 30
>> times>},
>> my_new_memb_list = {{nodeid = 100729354, family = 2,
>> addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =
>> 117506570,
>> family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>> nodeid = 84607498, family = 2,
>> addr = "\n\002\v\005", '\0' <repeats 11 times>}, {nodeid =
>> 4262724106,
>> family = 2, addr = "\n\002\024?", '\0' <repeats 11 times>},
>> {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>} <repeats 28
>> times>},
>> my_trans_memb_list = {{nodeid = 100729354, family = 2,
>> addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =
>> 117506570,
>> family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>> nodeid = 4262724106, family = 2,
>> addr = "\n\002\024?", '\0' <repeats 11 times>}, {nodeid = 0,
>> family = 0,
>> addr = '\0' <repeats 15 times>} <repeats 29 times>},
>> my_memb_list = {{
>> nodeid = 100729354, family = 2,
>> addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =
>> 117506570,
>> family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>> nodeid = 4262724106, family = 2,
>> addr = "\n\002\024?", '\0' <repeats 11 times>}, {nodeid =
>> 4262724106,
>> family = 2, addr = "\n\002\024?", '\0' <repeats 11 times>},
>> {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>} <repeats 28
>> times>},
>> my_deliver_memb_list = {{nodeid = 100729354, family = 2,
>> addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =
>> 117506570,
>> family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>}, {
>> nodeid = 4262724106, family = 2,
>> addr = "\n\002\024?", '\0' <repeats 11 times>}, {nodeid = 0,
>> family = 0,
>> addr = '\0' <repeats 15 times>} <repeats 29 times>},
>> my_nodeid_lookup_list = {{nodeid = 117506570, family = 2,
>> addr = "\n\002\001\a", '\0' <repeats 11 times>}, {nodeid =
>> 100729354,
>> family = 2, addr = "\n\002\001\006", '\0' <repeats 11
>> times>}, {
>> nodeid = 84607498, family = 2,
>> addr = "\n\002\v\005", '\0' <repeats 11 times>}, {nodeid =
>> 4262724106,
>> family = 2, addr = "\n\002\024?", '\0' <repeats 11 times>},
>> {nodeid = 0,
>> family = 0, addr = '\0' <repeats 15 times>} <repeats 28
>> times>},
>> my_proc_list_entries = 4, my_failed_list_entries = 0,
>> my_new_memb_entries = 4, my_trans_memb_entries = 3,
>> my_memb_entries = 3,
>> my_deliver_memb_entries = 3, my_nodeid_lookup_entries = 4,
>> my_ring_id = {
>> rep = {nodeid = 100729354, family = 2,
>> addr = "\n\002\001\006", '\0' <repeats 11 times>}, seq =
>> 87120},
>> my_old_ring_id = {rep = {nodeid = 100729354, family = 2,
>> addr = "\n\002\001\006", '\0' <repeats 11 times>}, seq =
>> 87116},
>> my_aru_count = 0, my_merge_detect_timeout_outstanding = 0,
>> my_last_aru = 272, my_seq_unchanged = 0, my_received_flg = 0,
>> my_high_seq_received = 272, my_install_seq = 0,
>> my_rotation_counter = 0,
>> my_set_retrans_flg = 1, my_retrans_flg_count = 0,
>> my_high_ring_delivered = 0, heartbeat_timeout = 764,
>> new_message_queue = {
>> head = 108, tail = 50, used = 57, usedhw = 57, size = 195,
>> items = 0x83e7000, size_per_item = 48, iterator = 0},
>> retrans_message_queue = {head = 0, tail = 499, used = 0, usedhw
>> = 0,
>> size = 500, items = 0x83ce000, size_per_item = 48, iterator =
>> 0},
>> regular_sort_queue = {head = 0, size = 256, items = 0x83d4000,
>> items_inuse = 0x83c0c00, size_per_item = 44, head_seqid = 0,
>> item_count = 256, pos_max = 0}, recovery_sort_queue = {head = 0,
>> size = 256, items = 0x83d7000, items_inuse = 0x83c4000,
>> size_per_item = 44, head_seqid = 0, item_count = 256, pos_max =
>> 255},
>> my_aru = 255, my_high_delivered = 0,
>> token_callback_received_listhead = {
>> next = 0x83b3440, prev = 0x83b3440},
>> token_callback_sent_listhead = {
>> next = 0x83c77f0, prev = 0x83c77f0}, orf_token_retransmit =
>> 0x83ca000 "",
>> orf_token_retransmit_size = 88, my_token_seq = 64,
>> timer_orf_token_timeout = 0x856d660,
>> timer_orf_token_retransmit_timeout = 0x85f0d20,
>> timer_orf_token_hold_retransmit_timeout = 0x0,
>> timer_merge_detect_timeout = 0x0,
>> memb_timer_state_gather_join_timeout = 0x0,
>> memb_timer_state_gather_consensus_timeout = 0x0,
>> memb_timer_state_commit_timeout = 0x0, timer_heartbeat_timeout =
>> 0x83b33e0,
>> totemsrp_log_level_security = 65538, totemsrp_log_level_error =
>> 131074,
>> totemsrp_log_level_warning = 196610, totemsrp_log_level_notice =
>> 262146,
>> totemsrp_log_level_debug = 327682,
>> totemsrp_log_printf = 0x805ff60 <internal_log_printf>,
>> memb_state = MEMB_STATE_RECOVERY, my_id = {nodeid = 117506570,
>> family = 2,
>> addr = "\n\002\001\a", '\0' <repeats 11 times>}, next_memb = {
>> nodeid = 84607498, family = 2,
>> addr = "\n\002\v\005", '\0' <repeats 11 times>},
>> iov_buffer = '\0' <repeats 8999 times>, totemsrp_iov_recv =
>> {iov_base = 0x0,
>> iov_len = 0}, totemsrp_poll_handle = 0, totemsrp_recv = 0,
>> mcast_address = {nodeid = 0, family = 2,
>> addr = "?^\001\003", '\0' <repeats 11 times>},
>> totemsrp_deliver_fn = 0x8056edc <totemmrp_deliver_fn>,
>> totemsrp_confchg_fn = 0x8056f10 <totemmrp_confchg_fn>,
>> global_seqno = 246,
>> my_token_held = 0, token_ring_id_seq = 87120, last_released = 0,
>> set_aru = 4294967295, old_ring_state_saved = 1, old_ring_state_aru
>> = 0,
>> old_ring_state_high_seq_received = 0, ring_saved = 1, my_last_seq
>> = 272,
>> tv_old = {tv_sec = 0, tv_usec = 0}, totemrrp_handle = 0,
>> totem_config = 0x3fbfed14, use_heartbeat = 1, my_trc = 0,
>> my_pbl = 0}
>> (gdb)
>> (gdb)
>> (gdb) print range
>> $3 = 17
>>
>>
>> ========================================================================
>> core3 (10.2.1.7,07-04-aisexec-bug2.core):
>>
>> F200XA105910400601>gdb 07-04-aisexec-bug2 07-04-aisexec-bug2.core
>> GNU gdb 6.1.1 [FreeBSD]
>> This GDB was configured as "i386-marcel-freebsd"...
>> Attaching to program: /log/07-04-aisexec-bug2, process 7
>> ptrace: Invalid argument.
>>
>> warning: core file may not match specified executable file.
>> Core was generated by `aisexec'.
>> Program terminated with signal 11, Segmentation fault.
>> Reading symbols from /usr/lib/libpthread.so.2...done.
>> Loaded symbols for /usr/lib/libpthread.so.2
>> Reading symbols from /lib/libc.so.6...done.
>> Loaded symbols for /lib/libc.so.6
>> Reading symbols from /libexec/ld-elf.so.1...done.
>> Loaded symbols for /libexec/ld-elf.so.1
>> #0 0x280f8e21 in memcmp () from /lib/libc.so.6
>> [New LWP 100093]
>> (gdb) bt
>> #0 0x280f8e21 in memcmp () from /lib/libc.so.6
>> #1 0x08057ce7 in group_matches (iovec=0x808b148, iov_len=1,
>> groups_b=0x83b56d0, group_b_cnt=1, adjust_iovec=0x3fbfc5bc)
>> at totempg.c:308
>> #2 0x08057b86 in app_deliver_fn (source_addr=0x3fbfe9b0,
>> iovec=0x808b148,
>> iov_len=1, endian_conversion_required=0) at totempg.c:340
>> #3 0x080579dd in totempg_deliver_fn (source_addr=0x3fbfe9b0,
>> iovec=0x83d5760,
>> iov_len=1, endian_conversion_required=0) at totempg.c:539
>> #4 0x08056f05 in totemmrp_deliver_fn (source_addr=0x3fbfe9b0,
>> iovec=0x83d5760, iov_len=1, endian_conversion_required=0) at
>> totemmrp.c:81
>> #5 0x08054cc3 in messages_deliver_to_app (instance=0x83c6000,
>> skip=0,
>> end_point=136) at totemsrp.c:3164
>> #6 0x08055065 in message_handler_mcast (instance=0x83c6000,
>> system_from=0x3fbfeb90, msg=0x83e2650, msg_len=1336,
>> endian_conversion_needed=0) at totemsrp.c:3301
>> #7 0x08056dc5 in main_deliver_fn (context=0x83c6000,
>> system_from=0x3fbfeb90,
>> msg=0x83e2650, msg_len=1336) at totemsrp.c:3720
>> #8 0x0804df12 in active_mcast_recv (instance=0x83b4700,
>> context=0x83c6000,
>> system_from=0x3fbfeb90, msg=0x83e2650, msg_len=1336) at
>> totemrrp.c:393
>> #9 0x0804e2be in rrp_deliver_fn (context=0x83b5670,
>> system_from=0x3fbfeb90,
>> msg=0x83e2650, msg_len=1336) at totemrrp.c:549
>> #10 0x0804c3b6 in net_deliver_fn (handle=0, fd=6, revents=1,
>> data=0x83e2000,
>> prio=0x83b4880) at totemnet.c:687
>> #11 0x0804ab76 in poll_run (handle=0) at aispoll.c:424
>> ---Type <return> to continue, or q <return> to quit---frame 1
>> #12 0x0805fdaf in main (argc=1, argv=0x3fbfee88) at main.c:1317
>> (gdb) print *iovec
>> No symbol "iovec" in current context.
>> (gdb) frame 1
>> #1 0x08057ce7 in group_matches (iovec=0x808b148, iov_len=1,
>> groups_b=0x83b56d0, group_b_cnt=1, adjust_iovec=0x3fbfc5bc)
>> at totempg.c:308
>> 308 totempg.c: No such file or directory.
>> in totempg.c
>> (gdb) print *iovec
>> $1 = {iov_base = 0x8525016, iov_len = 72033}
>> (gdb) print *groups_b
>> $2 = {group = 0x807e617, group_len = 1}
>> (gdb)
>> $3 = {group = 0x807e617, group_len = 1}
>> (gdb) print i
>> $4 = 7833
>> (gdb) print j
>> $5 = 0
>> (gdb) print group_len[0]
>> $6 = 41035
>> (gdb)
>>
>> <core-1.debug.patch>
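
A note for reading the dumps above: the nodeid values appear to be the node's IPv4 address stored as a 32-bit integer in the i386 host's little-endian byte order. For example, 117506570 is 0x0701020A, whose bytes in memory order are 10, 2, 1, 7, matching the addr field "\n\002\001\a" printed next to it. A small decoding helper, as a sketch under that assumption (`nodeid_to_dotted()` is a hypothetical name, not an openais function):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a nodeid as a dotted quad, taking the least significant byte
 * as the first octet (the layout inferred from the dumps, not confirmed
 * against the openais source).
 */
static void nodeid_to_dotted(unsigned int nodeid, char *out, size_t out_len)
{
    assert(out != NULL);
    snprintf(out, out_len, "%u.%u.%u.%u",
             nodeid & 0xffU,
             (nodeid >> 8) & 0xffU,
             (nodeid >> 16) & 0xffU,
             (nodeid >> 24) & 0xffU);
}
```

Applied to the four nodeids in consensus_list, this gives 10.2.1.7, 10.2.1.6, 10.2.20.254 and 10.2.11.5, which agrees with the hosts named in the core headings.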