[Openais] PATCH: aisexec leak & crash (updated)
Fabien THOMAS
fabien.thomas at netasq.com
Fri Sep 15 00:39:51 PDT 2006
Any news on this?
On 30 Aug 2006, at 17:43, Fabien THOMAS wrote:
> After doing additional testing, you can find attached the latest
> patch for trunk and whitetank.
> There are 3 pending problems that need a solution (here I need
> help), two of them ending in a crash of aisexec.
>
> Now aisexec can run continuously without growing dangerously in
> size, but it still crashes in the reported case.
>
> SOLVED:
> =======
>
> - totemsrp.c: missing free in case of error.
>
> - ipc.c: private_data not freed
>
> - ckpt.c: misplaced call to hdb_destroy in the checkpoint module
> (hdb was destroyed during iterator finalize).
>
> - all: pthread_mutex_destroy is never used:
> I've tried to find the right place to destroy each mutex, but it
> needs to be checked by the module owner.
>
> - all: pthread_attr_destroy and pthread_cond_destroy are never used
> (but not too many leaks here)
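
As an illustration of the missing destroy calls listed above, here is a minimal sketch of pairing every pthread init with its destroy. The `demo_conn` structure and the `demo_*` functions are hypothetical names made up for this example; they are not the openais structures or the code in the attached patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-connection state; NOT the real openais conn_info. */
struct demo_conn {
	pthread_mutex_t mutex;
	pthread_cond_t cond;
};

/* Allocate a connection and initialise its synchronisation objects. */
struct demo_conn *demo_conn_create (void)
{
	struct demo_conn *conn = calloc (1, sizeof (*conn));

	if (conn == NULL) {
		return (NULL);
	}
	if (pthread_mutex_init (&conn->mutex, NULL) != 0) {
		free (conn);
		return (NULL);
	}
	if (pthread_cond_init (&conn->cond, NULL) != 0) {
		/* Undo the mutex init before giving up. */
		pthread_mutex_destroy (&conn->mutex);
		free (conn);
		return (NULL);
	}
	return (conn);
}

/* Destroy in reverse order of creation. Returns 0 on success, -1 on error. */
int demo_conn_destroy (struct demo_conn *conn)
{
	int res;

	if (conn == NULL) {
		return (-1);
	}
	res = pthread_cond_destroy (&conn->cond);
	res |= pthread_mutex_destroy (&conn->mutex);
	free (conn);
	return (res == 0 ? 0 : -1);
}

/* Start a detached thread; the attr object is destroyed once it was used. */
int demo_thread_start (void *(*fn) (void *), void *arg)
{
	pthread_attr_t attr;
	pthread_t thread;
	int res;

	if (pthread_attr_init (&attr) != 0) {
		return (-1);
	}
	pthread_attr_setdetachstate (&attr, PTHREAD_CREATE_DETACHED);
	res = pthread_create (&thread, &attr, fn, arg);
	pthread_attr_destroy (&attr);
	return (res == 0 ? 0 : -1);
}
```

The attr object can be destroyed right after pthread_create returns, since the new thread does not depend on it afterwards. Where the mutex and condition variable should be destroyed in each real module still needs to be decided by the module owner.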
>
> conn_info->shared_mutex is allocated by two threads and overwritten
> by one, so the mutex is lost:
> here I've just done a quick fix, but the real solution needs to be
> found later.
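
One possible way to avoid the lost mutex described above is to make the allocation idempotent, so that whichever thread arrives second reuses the existing mutex instead of overwriting it. The sketch below assumes shared_mutex is a pointer to a heap-allocated mutex; `demo_conn_info`, `demo_setup_lock` and the `demo_shared_mutex_*` functions are hypothetical and are not the quick fix contained in the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for conn_info; only the field of interest is kept. */
struct demo_conn_info {
	pthread_mutex_t *shared_mutex;
};

/* Global lock that serialises creation and release of shared_mutex. */
static pthread_mutex_t demo_setup_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Create shared_mutex only if it does not exist yet.
 * Returns 1 if this call created it, 0 if it already existed, -1 on error.
 */
int demo_shared_mutex_ensure (struct demo_conn_info *ci)
{
	int res = 0;

	pthread_mutex_lock (&demo_setup_lock);
	if (ci->shared_mutex == NULL) {
		pthread_mutex_t *m = malloc (sizeof (*m));

		if (m == NULL || pthread_mutex_init (m, NULL) != 0) {
			free (m);
			res = -1;
		} else {
			ci->shared_mutex = m;
			res = 1;
		}
	}
	pthread_mutex_unlock (&demo_setup_lock);
	return (res);
}

/* Destroy and free shared_mutex exactly once. */
void demo_shared_mutex_release (struct demo_conn_info *ci)
{
	pthread_mutex_lock (&demo_setup_lock);
	if (ci->shared_mutex != NULL) {
		pthread_mutex_destroy (ci->shared_mutex);
		free (ci->shared_mutex);
		ci->shared_mutex = NULL;
	}
	pthread_mutex_unlock (&demo_setup_lock);
}
```

Both threads would call demo_shared_mutex_ensure before using the mutex; only the first call allocates. Allocating the mutex once, before either thread is started, would be simpler if the code structure allows it.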
>
> PENDING:
> ========
>
> PROBLEM 1: checkpoint iterator
> ------------------
>
> a) iterators are broken when a recovery is started (the internal
> structure points to the old checkpoint freed by the recovery process)
>
> Muni is aware of the problem but the code needs to be reworked.
>
> b) ckpt_lib_exit_fn does not free the iterators (there is a TODO
> block in the code).
> The problem with this leak is that aisexec grows each time I run
> a client.
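
For point b) above, a rough sketch of what releasing iterators at library exit could look like, assuming each connection keeps its open iterators on a linked list in its private data. The `demo_*` types and functions are invented for the example and do not reflect the real ckpt.c data structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical iterator record owned by one library connection. */
struct demo_iterator {
	struct demo_iterator *next;
	void *entries;
};

/* Hypothetical per-connection private data for the checkpoint service. */
struct demo_ckpt_pd {
	struct demo_iterator *iterators;
};

/* Open an iterator and remember it on the connection. Returns 0 or -1. */
int demo_iterator_open (struct demo_ckpt_pd *pd, size_t entry_bytes)
{
	struct demo_iterator *it = calloc (1, sizeof (*it));

	if (it == NULL) {
		return (-1);
	}
	it->entries = malloc (entry_bytes);
	if (it->entries == NULL) {
		free (it);
		return (-1);
	}
	it->next = pd->iterators;
	pd->iterators = it;
	return (0);
}

/*
 * Exit handler: free every iterator the client left open.
 * Returns the number of iterators released.
 */
int demo_lib_exit (struct demo_ckpt_pd *pd)
{
	int released = 0;

	while (pd->iterators != NULL) {
		struct demo_iterator *it = pd->iterators;

		pd->iterators = it->next;
		free (it->entries);
		free (it);
		released++;
	}
	return (released);
}
```

With something along these lines, a client that exits without finalizing its iterators no longer leaves memory behind in the daemon.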
>
> PROBLEM 2: reference to conn_info after free
> -----------------
>
> race condition somewhere during connection close (I hit this while
> breaking the client application):
>
> it seems that the checkpoint module accesses the conn_info structure
> after the structure is destroyed.
> ==18010== Invalid read of size 4
> ==18010== at 0x8064248: libais_connection_active (ipc.c:332)
> ==18010== by 0x80655D1: openais_conn_send_response (ipc.c:963)
> struct conn_info *conn_info = (struct conn_info *)conn;
>
> if (conn_info == NULL) {
> return -1;
> }
> here ==> if (!libais_connection_active (conn_info)) {
> return (-1);
> }
>
> ==18010== by 0x8072CAF: message_handler_req_exec_ckpt_sectionread (ckpt.c:3337)
>
> /*
> * Write read response to CKPT library
> */
> error_exit:
> if (message_source_is_local(&req_exec_ckpt_sectionread->source)) {
> res_lib_ckpt_sectionread.header.size = sizeof
> (struct res_lib_ckpt_sect
> res_lib_ckpt_sectionread.header.id =
> MESSAGE_RES_CKPT_CHECKPOINT_SECTIO
> res_lib_ckpt_sectionread.header.error = error;
>
> if (section_size != 0) {
> res_lib_ckpt_sectionread.data_read =
> section_size;
> }
>
> here ==> openais_conn_send_response (
> req_exec_ckpt_sectionread->source.conn,
> &res_lib_ckpt_sectionread,
> sizeof (struct res_lib_ckpt_sectionread));
>
>
> ==18010== by 0x806138B: deliver_fn (main.c:357)
> ==18010== by 0x805B939: app_deliver_fn (totempg.c:395)
> ==18010== by 0x805B70D: totempg_deliver_fn (totempg.c:553)
> ==18010== by 0x805AAC2: totemmrp_deliver_fn (totemmrp.c:81)
> ==18010== by 0x805843D: messages_deliver_to_app (totemsrp.c:3439)
> ==18010== by 0x80580B8: message_handler_orf_token (totemsrp.c:3318)
> ==18010== by 0x805A8EA: main_deliver_fn (totemsrp.c:4023)
> ==18010== by 0x804EBC9: none_token_recv (totemrrp.c:506)
> ==18010== by 0x80504BB: rrp_deliver_fn (totemrrp.c:1308)
> ==18010== by 0x804CC2B: net_deliver_fn (totemnet.c:679)
> ==18010== by 0x804B17C: poll_run (aispoll.c:402)
> ==18010== by 0x8061C8F: main (main.c:594)
> ==18010== Address 0x41D51B8 is 8 bytes inside a block of size 188 free'd
> ==18010== at 0x401CFCF: free (vg_replace_malloc.c:235)
> ==18010== by 0x80640D5: conn_info_destroy (ipc.c:327)
> ==18010== by 0x8064524: prioritized_poll_thread (ipc.c:456)
> ==18010== by 0x4032340: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
> ==18010== by 0x41084ED: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)
> ==18010==
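
The trace above shows the poll thread freeing the connection while a response for it is still being delivered. One common way to close this kind of race is reference counting: the structure is freed only when the last holder drops its reference. The sketch below is only an illustration of that idea; `demo_conn` and the `demo_conn_*` functions are hypothetical, and this is not claimed to be the fix openais should adopt.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical connection object; NOT the real openais conn_info. */
struct demo_conn {
	pthread_mutex_t lock;
	int refcount;
	int active;
};

/* Create a connection holding one reference (owned by the poll thread). */
struct demo_conn *demo_conn_new (void)
{
	struct demo_conn *c = calloc (1, sizeof (*c));

	if (c == NULL) {
		return (NULL);
	}
	if (pthread_mutex_init (&c->lock, NULL) != 0) {
		free (c);
		return (NULL);
	}
	c->refcount = 1;
	c->active = 1;
	return (c);
}

/* Take a reference, e.g. while a request for this connection is in flight. */
void demo_conn_ref (struct demo_conn *c)
{
	pthread_mutex_lock (&c->lock);
	c->refcount++;
	pthread_mutex_unlock (&c->lock);
}

/* Drop a reference. Returns 1 if this call freed the object, else 0. */
int demo_conn_unref (struct demo_conn *c)
{
	int last;

	pthread_mutex_lock (&c->lock);
	last = (--c->refcount == 0);
	pthread_mutex_unlock (&c->lock);
	if (last) {
		pthread_mutex_destroy (&c->lock);
		free (c);
	}
	return (last);
}

/* Mark the connection closed without freeing it. */
void demo_conn_close (struct demo_conn *c)
{
	pthread_mutex_lock (&c->lock);
	c->active = 0;
	pthread_mutex_unlock (&c->lock);
}

/* Stand-in for sending a response: -1 if the connection is closed. */
int demo_conn_send_response (struct demo_conn *c)
{
	int active;

	pthread_mutex_lock (&c->lock);
	active = c->active;
	pthread_mutex_unlock (&c->lock);
	return (active ? 0 : -1);
}
```

In this scheme the request path takes a reference before the message is multicast and drops it after the response attempt, so a connection closed in between is seen as inactive rather than read after free.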
>
> PROBLEM 3:
> ------------------
>
> unsolved leak:
>
> totemsrp.c: there are 2 TODO LEAK markers that really do leak, but I
> can't figure out where these blocks should be freed (iovec and mcast).
>
>
> <patch-leak>
> _______________________________________________
> Openais mailing list
> Openais at lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/openais