[Openais] PATCH: aisexec leak & crash (updated)

Steven Dake steven.dake at gmail.com
Fri Sep 15 01:49:06 PDT 2006


Fabien,

I am currently troubleshooting two critical issues that are higher
priority.

These issues are:
1) With the CPG service, it is possible to overflow the input buffer of the
CPG clients if the CPG clients are too slow.

2) Checkpoint synchronization doesn't always work properly in failure
conditions after the changes to checkpoint unlink.  Also, occasionally a
checkpoint is closed when its reference count is zero in test cases.

I hope to have both of these issues resolved in the next 5-7 days, after
which I will put these leaks at the top of the list.

Regards
-steve

On 9/15/06, Fabien THOMAS <fabien.thomas at netasq.com> wrote:
>
> any news on this ?
>
> On 30 Aug 06 at 17:43, Fabien THOMAS wrote:
>
> > After doing additional testing, you can find attached the latest
> > patch for trunk and whitetank.
> > There are 3 problems pending that need a solution (here I need
> > help), two of them ending in a crash of aisexec.
> >
> > Now aisexec can continuously run without growing dangerously in
> > size but it is still crashing in the reported case.
> >
> > SOLVED:
> > =======
> >
> > - totemsrp.c: missing free in case of error.
> >
> > - ipc.c: private_data not freed
> >
> > - ckpt.c: misplaced call to hdb_destroy in the checkpoint module
> > (hdb was destroyed during iterator finalize).
> >
> > - all: pthread_mutex_destroy is never used:
> > I've tried to find the right place to destroy each mutex, but it
> > needs to be checked by the module owner.
> >
> > - all: pthread_attr_destroy and pthread_cond_destroy are never used
> > (but not too many leaks here)
> >
> > conn_info->shared_mutex is allocated by two threads and overwritten
> > by one, so the mutex is lost:
> > here I've just done a quick fix, but the real solution needs to be
> > found later.
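
The shared_mutex item above can be illustrated with a minimal sketch in
which exactly one owner initializes the mutex before the structure is
handed to any other thread, and the same owner destroys it on teardown.
The names conn_sketch, conn_sketch_create and conn_sketch_destroy are
hypothetical and are not taken from ipc.c; this is not the quick fix in
the attached patch, only one possible shape for the lifecycle.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for conn_info; only the mutex lifecycle is shown. */
struct conn_sketch {
    pthread_mutex_t shared_mutex;
    int mutex_ready; /* set once by the creator, never by a second thread */
};

/* The creator initializes the mutex before the struct is shared, so no
 * second thread ever needs to (re)initialize it and overwrite the first. */
struct conn_sketch *conn_sketch_create(void)
{
    struct conn_sketch *c = calloc(1, sizeof(*c));
    if (c == NULL) {
        return NULL;
    }
    if (pthread_mutex_init(&c->shared_mutex, NULL) != 0) {
        free(c);
        return NULL;
    }
    c->mutex_ready = 1;
    return c;
}

/* Teardown pairs every pthread_mutex_init with a pthread_mutex_destroy.
 * Returns 0 on success or the error code from pthread_mutex_destroy. */
int conn_sketch_destroy(struct conn_sketch *c)
{
    int rc = 0;
    if (c == NULL) {
        return 0;
    }
    if (c->mutex_ready) {
        rc = pthread_mutex_destroy(&c->shared_mutex);
    }
    free(c);
    return rc;
}
```

A caller would create the structure once, pass the pointer to the second
thread, and call conn_sketch_destroy only after that thread has stopped
using the mutex.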
> >
> > PENDING:
> > ========
> >
> > PROBLEM 1: checkpoint iterator
> > ------------------
> >
> > a) iterators are broken when a recovery is started (the internal
> > structure points to the old checkpoint freed by the recovery process)
> >
> > Muni is aware of the problem, but the code needs to be reworked.
> >
> > b) ckpt_lib_exit_fn does not free iterators (there is a block of
> > TODO in the code).
> > The problem with this leak is that each time I run a client, aisexec
> > grows.
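
For point b) above, one way to picture the missing cleanup is a
per-connection list of iterators that the exit handler walks and frees.
This is only a sketch: the types iter_sketch and ckpt_pd_sketch and the
functions below are hypothetical, and the real iterator layout in ckpt.c
is not reproduced here.

```c
#include <stdlib.h>

/* Hypothetical iterator record owned by one library connection. */
struct iter_sketch {
    struct iter_sketch *next;
    void *entries; /* per-iterator allocation that must also be freed */
};

/* Hypothetical per-connection private data holding the iterator list. */
struct ckpt_pd_sketch {
    struct iter_sketch *iterators;
};

/* Allocate an iterator and link it to the connection. Returns 0 on
 * success and -1 on allocation failure. */
int iter_sketch_add(struct ckpt_pd_sketch *pd, size_t entry_bytes)
{
    struct iter_sketch *it = calloc(1, sizeof(*it));
    if (it == NULL) {
        return -1;
    }
    if (entry_bytes > 0) {
        it->entries = malloc(entry_bytes);
        if (it->entries == NULL) {
            free(it);
            return -1;
        }
    }
    it->next = pd->iterators;
    pd->iterators = it;
    return 0;
}

/* What an exit handler could do: free every iterator still attached to
 * the connection. Returns the number of iterators released. */
unsigned int iter_sketch_free_all(struct ckpt_pd_sketch *pd)
{
    unsigned int released = 0;
    while (pd->iterators != NULL) {
        struct iter_sketch *it = pd->iterators;
        pd->iterators = it->next;
        free(it->entries);
        free(it);
        released++;
    }
    return released;
}
```

Calling iter_sketch_free_all a second time is harmless because the list
head is left NULL after the first pass.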
> >
> > PROBLEM 2: reference to conn_info after free
> > -----------------
> >
> > race condition somewhere during connection close (I get this when
> > breaking the client application):
> >
> > it seems that the checkpoint module accesses the conn_info structure
> > after the structure is destroyed.
> > ==18010== Invalid read of size 4
> > ==18010==    at 0x8064248: libais_connection_active (ipc.c:332)
> > ==18010==    by 0x80655D1: openais_conn_send_response (ipc.c:963)
> >       struct conn_info *conn_info = (struct conn_info *)conn;
> >
> >         if (conn_info == NULL) {
> >                 return -1;
> >         }
> > here ==>        if (!libais_connection_active (conn_info)) {
> >                 return (-1);
> >         }
> >
> > ==18010==    by 0x8072CAF: message_handler_req_exec_ckpt_sectionread (ckpt.c:3337)
> >
> >        /*
> >          * Write read response to CKPT library
> >          */
> > error_exit:
> >         if (message_source_is_local(&req_exec_ckpt_sectionread->source)) {
> >                 res_lib_ckpt_sectionread.header.size = sizeof (struct res_lib_ckpt_sect
> >                 res_lib_ckpt_sectionread.header.id = MESSAGE_RES_CKPT_CHECKPOINT_SECTIO
> >                 res_lib_ckpt_sectionread.header.error = error;
> >
> >                 if (section_size != 0) {
> >                         res_lib_ckpt_sectionread.data_read = section_size;
> >                 }
> >
> >   here ==>              openais_conn_send_response (
> >                         req_exec_ckpt_sectionread->source.conn,
> >                         &res_lib_ckpt_sectionread,
> >                         sizeof (struct res_lib_ckpt_sectionread));
> >
> >
> > ==18010==    by 0x806138B: deliver_fn (main.c:357)
> > ==18010==    by 0x805B939: app_deliver_fn (totempg.c:395)
> > ==18010==    by 0x805B70D: totempg_deliver_fn (totempg.c:553)
> > ==18010==    by 0x805AAC2: totemmrp_deliver_fn (totemmrp.c:81)
> > ==18010==    by 0x805843D: messages_deliver_to_app (totemsrp.c:3439)
> > ==18010==    by 0x80580B8: message_handler_orf_token (totemsrp.c:3318)
> > ==18010==    by 0x805A8EA: main_deliver_fn (totemsrp.c:4023)
> > ==18010==    by 0x804EBC9: none_token_recv (totemrrp.c:506)
> > ==18010==    by 0x80504BB: rrp_deliver_fn (totemrrp.c:1308)
> > ==18010==    by 0x804CC2B: net_deliver_fn (totemnet.c:679)
> > ==18010==    by 0x804B17C: poll_run (aispoll.c:402)
> > ==18010==    by 0x8061C8F: main (main.c:594)
> > ==18010==  Address 0x41D51B8 is 8 bytes inside a block of size 188 free'd
> > ==18010==    at 0x401CFCF: free (vg_replace_malloc.c:235)
> > ==18010==    by 0x80640D5: conn_info_destroy (ipc.c:327)
> > ==18010==    by 0x8064524: prioritized_poll_thread (ipc.c:456)
> > ==18010==    by 0x4032340: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
> > ==18010==    by 0x41084ED: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)
> > ==18010==
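
The trace above shows conn_info being freed by the poll thread while a
checkpoint message that still carries the connection pointer is in
flight. One common way to avoid this kind of use-after-free is reference
counting, so that closing the connection only marks it inactive and the
memory is released when the last holder drops its reference. The sketch
below is an assumption about how that could look, not the fix adopted in
openais; every name in it is hypothetical, and the counter is shown
without locking, so real multi-threaded code would need to protect it
with a mutex or atomic operations.

```c
#include <stdlib.h>

/* Hypothetical reference-counted connection record. */
struct conn_ref_sketch {
    unsigned int refcount; /* one per holder, including the owner */
    int active;            /* cleared on close, memory stays valid */
};

/* Create a connection with a single reference held by its owner. */
struct conn_ref_sketch *conn_ref_create(void)
{
    struct conn_ref_sketch *c = malloc(sizeof(*c));
    if (c == NULL) {
        return NULL;
    }
    c->refcount = 1;
    c->active = 1;
    return c;
}

/* Taken when a pointer to the connection is stored in an in-flight message. */
void conn_ref_get(struct conn_ref_sketch *c)
{
    c->refcount++;
}

/* Drop one reference. Returns 1 if this call freed the connection. */
int conn_ref_put(struct conn_ref_sketch *c)
{
    if (--c->refcount == 0) {
        free(c);
        return 1;
    }
    return 0;
}

/* Close marks the connection inactive and drops the owner's reference.
 * The memory survives while any message still holds a reference. */
int conn_ref_close(struct conn_ref_sketch *c)
{
    c->active = 0;
    return conn_ref_put(c);
}

/* Stand-in for a send path: refuses inactive connections, and is safe to
 * call as long as the caller still holds a reference. */
int send_response_sketch(struct conn_ref_sketch *c)
{
    if (c == NULL || !c->active) {
        return -1;
    }
    return 0;
}
```

With this shape, the handler that delivers the section-read response
would hold its own reference, see the connection as inactive after the
close, return -1, and then drop the reference that finally frees the
block.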
> >
> > PROBLEM 3:
> > ------------------
> >
> > unsolved leak:
> >
> > totemsrp.c: there are 2 TODO LEAK markers that really leak, but I can't
> > figure out where these blocks should be freed (iovec and mcast).
> >
> >
> > <patch-leak>
> > _______________________________________________
> > Openais mailing list
> > Openais at lists.osdl.org
> > https://lists.osdl.org/mailman/listinfo/openais
>
>