[Openais] PATCH: aisexec leak & crash (updated)

Fabien THOMAS fabien.thomas at netasq.com
Fri Sep 15 00:39:51 PDT 2006


Any news on this?

On 30 August 2006 at 17:43, Fabien THOMAS wrote:

> After additional testing, please find attached the latest patch for
> trunk and whitetank.
> Three problems are still pending and need a solution (I need help
> here); two of them end in a crash of aisexec.
>
> aisexec can now run continuously without growing dangerously in
> size, but it still crashes in the reported case.
>
> SOLVED:
> =======
>
> - totemsrp.c: missing free in case of error.
>
> - ipc.c: private_data not freed
>
> - ckpt.c: misplaced call to hdb_destroy in the checkpoint module  
> (hdb was destroyed during iterator finalize).
>
> - all: pthread_mutex_destroy is never called.
> I've tried to find the right place to destroy each mutex, but this
> needs to be checked by each module owner.
>
> - all: pthread_attr_destroy and pthread_cond_destroy are never called
> (though there are not many leaks here).
>
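
For reference, the pairing described in the two items above can be sketched as follows. This is only an illustration under stated assumptions: struct demo_conn and the demo_* functions are hypothetical names, not code from the patch or from aisexec. It shows every pthread_*_init matched by a pthread_*_destroy, with partial initialisation undone on error.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-connection state; NOT the real struct conn_info. */
struct demo_conn {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
};

/* Allocate and initialise; on failure, undo whatever was already set up. */
struct demo_conn *demo_conn_create(void)
{
    struct demo_conn *c = malloc(sizeof(*c));

    if (c == NULL) {
        return NULL;
    }
    if (pthread_mutex_init(&c->mutex, NULL) != 0) {
        free(c);
        return NULL;
    }
    if (pthread_cond_init(&c->cond, NULL) != 0) {
        pthread_mutex_destroy(&c->mutex);
        free(c);
        return NULL;
    }
    return c;
}

/* Tear down in reverse order; returns the first error code seen, or 0. */
int demo_conn_destroy(struct demo_conn *c)
{
    int rc_cond, rc_mutex;

    if (c == NULL) {
        return 0;
    }
    rc_cond = pthread_cond_destroy(&c->cond);
    rc_mutex = pthread_mutex_destroy(&c->mutex);
    free(c);
    return rc_cond != 0 ? rc_cond : rc_mutex;
}

/* Spawn a detached thread; the attr object is destroyed once it was used. */
int demo_spawn_detached(void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc = pthread_attr_init(&attr);

    if (rc != 0) {
        return rc;
    }
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0) {
        rc = pthread_create(&tid, &attr, fn, arg);
    }
    pthread_attr_destroy(&attr);
    return rc;
}
```

The attribute object can be destroyed immediately after pthread_create, since the new thread does not keep a reference to it.
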
> conn_info->shared_mutex is allocated by two threads and overwritten
> by one of them, so one mutex is lost.
> Here I've only done a quick fix; the real solution needs to be
> found later.
>
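
One possible shape for a longer-term fix of the double allocation is sketched below. It is an assumption, not what the attached patch does: it relies on C11 atomics, and struct demo_shared plus both functions are hypothetical. Each thread may allocate a candidate mutex, but only the first compare-and-swap installs it; the loser destroys its own copy instead of overwriting the pointer.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the structure holding the shared mutex. */
struct demo_shared {
    _Atomic(pthread_mutex_t *) shared_mutex;
};

/* Return the shared mutex, creating it at most once even under a race. */
pthread_mutex_t *demo_get_shared_mutex(struct demo_shared *c)
{
    pthread_mutex_t *expected = NULL;
    pthread_mutex_t *fresh;
    pthread_mutex_t *cur = atomic_load(&c->shared_mutex);

    if (cur != NULL) {
        return cur;
    }
    fresh = malloc(sizeof(*fresh));
    if (fresh == NULL) {
        return NULL;
    }
    if (pthread_mutex_init(fresh, NULL) != 0) {
        free(fresh);
        return NULL;
    }
    if (atomic_compare_exchange_strong(&c->shared_mutex, &expected, fresh)) {
        return fresh;           /* we installed our mutex */
    }
    /* Another thread won the race: discard ours, use theirs. */
    pthread_mutex_destroy(fresh);
    free(fresh);
    return expected;
}

/* Detach and destroy the shared mutex; callers must no longer use it. */
void demo_release_shared_mutex(struct demo_shared *c)
{
    pthread_mutex_t *old =
        atomic_exchange(&c->shared_mutex, (pthread_mutex_t *)NULL);

    if (old != NULL) {
        pthread_mutex_destroy(old);
        free(old);
    }
}
```

A simpler alternative, where the code allows it, would be to allocate the mutex once before the second thread is started.
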
> PENDING:
> ========
>
> PROBLEM 1: checkpoint iterator
> ------------------
>
> a) Iterators are broken when a recovery starts (the internal
> structures point to old checkpoints freed by the recovery process).
>
> Muni is aware of the problem, but the code needs to be reworked.
>
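
To make problem (a) concrete, here is a minimal sketch of one way an iterator could detect that recovery has replaced the data it was walking. It is an assumption about a possible rework, not the actual ckpt.c design: the generation counter and every demo_* name are hypothetical.

```c
#include <stddef.h>

/* Hypothetical checkpoint database; recovery swaps in a new name table. */
struct demo_ckpt_db {
    unsigned int generation;    /* bumped on every recovery */
    const char **names;
    size_t count;
};

/* Hypothetical iterator: remembers the generation it was created under. */
struct demo_ckpt_iter {
    const struct demo_ckpt_db *db;
    unsigned int generation;
    size_t pos;
};

void demo_iter_init(struct demo_ckpt_iter *it, const struct demo_ckpt_db *db)
{
    it->db = db;
    it->generation = db->generation;
    it->pos = 0;
}

/* Recovery replaces the table, which invalidates outstanding iterators. */
void demo_db_recover(struct demo_ckpt_db *db, const char **names, size_t count)
{
    db->names = names;
    db->count = count;
    db->generation++;
}

/* Returns 0 and stores a name, 1 when exhausted, -1 if the iterator is stale. */
int demo_iter_next(struct demo_ckpt_iter *it, const char **out)
{
    if (it->generation != it->db->generation) {
        return -1;              /* do not touch data from before recovery */
    }
    if (it->pos >= it->db->count) {
        return 1;
    }
    *out = it->db->names[it->pos++];
    return 0;
}
```

With a check like this, a stale iterator reports an error to the library instead of dereferencing freed checkpoint memory; the caller can then restart the iteration.
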
> b) ckpt_lib_exit_fn does not free the iterators (there is a TODO
> block in the code).
> The problem with this leak is that aisexec grows each time I run a
> client.
>
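
The cleanup missing in (b) could look roughly like the sketch below. The layout is assumed, not taken from ckpt.c: struct demo_private_data, struct demo_iterator and both functions are hypothetical. The idea is that each connection keeps a list of its open iterators and the exit handler walks that list.

```c
#include <stdlib.h>

/* Hypothetical iterator record owned by one library connection. */
struct demo_iterator {
    struct demo_iterator *next;
    int *entries;               /* stand-in for the iterator's entry table */
};

/* Hypothetical per-connection private data. */
struct demo_private_data {
    struct demo_iterator *iterators;
};

/* Open an iterator and link it to the connection; 0 on success, -1 on error. */
int demo_iterator_open(struct demo_private_data *pd, size_t n_entries)
{
    struct demo_iterator *it = malloc(sizeof(*it));

    if (it == NULL) {
        return -1;
    }
    it->entries = calloc(n_entries ? n_entries : 1, sizeof(*it->entries));
    if (it->entries == NULL) {
        free(it);
        return -1;
    }
    it->next = pd->iterators;
    pd->iterators = it;
    return 0;
}

/* Exit handler: release every iterator the client left open. */
size_t demo_lib_exit(struct demo_private_data *pd)
{
    size_t released = 0;
    struct demo_iterator *it = pd->iterators;

    while (it != NULL) {
        struct demo_iterator *next = it->next;

        free(it->entries);
        free(it);
        it = next;
        released++;
    }
    pd->iterators = NULL;
    return released;
}
```
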
> PROBLEM 2: reference to conn_info after free
> -----------------
>
> There is a race condition somewhere during connection close (I hit
> this when interrupting the client application).
>
> It seems that the checkpoint module accesses the conn_info structure
> after the structure has been destroyed:
> ==18010== Invalid read of size 4
> ==18010==    at 0x8064248: libais_connection_active (ipc.c:332)
> ==18010==    by 0x80655D1: openais_conn_send_response (ipc.c:963)
>       struct conn_info *conn_info = (struct conn_info *)conn;
>
>         if (conn_info == NULL) {
>                 return -1;
>         }
> here ==>        if (!libais_connection_active (conn_info)) {
>                 return (-1);
>         }
>
> ==18010==    by 0x8072CAF: message_handler_req_exec_ckpt_sectionread (ckpt.c:3337)
>
>        /*
>          * Write read response to CKPT library
>          */
> error_exit:
>         if (message_source_is_local(&req_exec_ckpt_sectionread->source)) {
>                 res_lib_ckpt_sectionread.header.size = sizeof (struct res_lib_ckpt_sect
>                 res_lib_ckpt_sectionread.header.id = MESSAGE_RES_CKPT_CHECKPOINT_SECTIO
>                 res_lib_ckpt_sectionread.header.error = error;
>
>                 if (section_size != 0) {
>                         res_lib_ckpt_sectionread.data_read = section_size;
>                 }
>
>   here ==>              openais_conn_send_response (
>                         req_exec_ckpt_sectionread->source.conn,
>                         &res_lib_ckpt_sectionread,
>                         sizeof (struct res_lib_ckpt_sectionread));
>
>
> ==18010==    by 0x806138B: deliver_fn (main.c:357)
> ==18010==    by 0x805B939: app_deliver_fn (totempg.c:395)
> ==18010==    by 0x805B70D: totempg_deliver_fn (totempg.c:553)
> ==18010==    by 0x805AAC2: totemmrp_deliver_fn (totemmrp.c:81)
> ==18010==    by 0x805843D: messages_deliver_to_app (totemsrp.c:3439)
> ==18010==    by 0x80580B8: message_handler_orf_token (totemsrp.c:3318)
> ==18010==    by 0x805A8EA: main_deliver_fn (totemsrp.c:4023)
> ==18010==    by 0x804EBC9: none_token_recv (totemrrp.c:506)
> ==18010==    by 0x80504BB: rrp_deliver_fn (totemrrp.c:1308)
> ==18010==    by 0x804CC2B: net_deliver_fn (totemnet.c:679)
> ==18010==    by 0x804B17C: poll_run (aispoll.c:402)
> ==18010==    by 0x8061C8F: main (main.c:594)
> ==18010==  Address 0x41D51B8 is 8 bytes inside a block of size 188 free'd
> ==18010==    at 0x401CFCF: free (vg_replace_malloc.c:235)
> ==18010==    by 0x80640D5: conn_info_destroy (ipc.c:327)
> ==18010==    by 0x8064524: prioritized_poll_thread (ipc.c:456)
> ==18010==    by 0x4032340: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
> ==18010==    by 0x41084ED: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)
> ==18010==
>
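
The trace above shows one thread freeing conn_info while a message still in flight holds the raw pointer. A common way to close such a window is reference counting, sketched below under assumptions: struct demo_conn_ref and the demo_ref_* functions are hypothetical and this is not the fix in the attached patch. The request takes a reference before it is multicast and drops it after the response is sent, so close only marks the connection inactive.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical refcounted connection; NOT the real struct conn_info. */
struct demo_conn_ref {
    pthread_mutex_t lock;
    int refs;                   /* one for the owner plus one per request */
    int active;
};

struct demo_conn_ref *demo_ref_create(void)
{
    struct demo_conn_ref *c = malloc(sizeof(*c));

    if (c == NULL) {
        return NULL;
    }
    if (pthread_mutex_init(&c->lock, NULL) != 0) {
        free(c);
        return NULL;
    }
    c->refs = 1;
    c->active = 1;
    return c;
}

/* Take a reference for a request that will be answered later. */
void demo_ref_get(struct demo_conn_ref *c)
{
    pthread_mutex_lock(&c->lock);
    c->refs++;
    pthread_mutex_unlock(&c->lock);
}

/* Drop a reference; returns 1 if this call freed the connection. */
int demo_ref_put(struct demo_conn_ref *c)
{
    int last;

    pthread_mutex_lock(&c->lock);
    last = (--c->refs == 0);
    pthread_mutex_unlock(&c->lock);
    if (last) {
        pthread_mutex_destroy(&c->lock);
        free(c);
    }
    return last;
}

/* Close: mark inactive and drop the owner's reference. */
int demo_ref_close(struct demo_conn_ref *c)
{
    pthread_mutex_lock(&c->lock);
    c->active = 0;
    pthread_mutex_unlock(&c->lock);
    return demo_ref_put(c);
}

/* Send stand-in: safe because the caller still holds a reference. */
int demo_ref_send(struct demo_conn_ref *c)
{
    int active;

    if (c == NULL) {
        return -1;
    }
    pthread_mutex_lock(&c->lock);
    active = c->active;
    pthread_mutex_unlock(&c->lock);
    return active ? 0 : -1;
}
```

In this scheme the memory is released by whichever side drops the last reference, so the check for an active connection never reads freed memory.
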
> PROBLEM 3:
> ------------------
>
> Unsolved leak:
>
> totemsrp.c: there are two TODO LEAK markers that really do leak, but
> I can't figure out where these blocks should be freed (iovec and mcast).
>
>
> <patch-leak>
