[Openais] CKPT: bug, global_ckpt_id not synced
Steven Dake
sdake at redhat.com
Thu Sep 7 08:41:40 PDT 2006
Hans,
I am also working on this particular problem, although "max" is not
sufficient since it is possible for checkpoint ids to wrap. Ideally the
checkpoint ids would wrap with a proper action.
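One wrap-tolerant alternative to a plain "max" is to compare ids circularly, in the style of serial number arithmetic. The following is only a minimal sketch, not the code in ckpt.c: the helper names ckpt_id_newer and ckpt_id_sync are hypothetical, uint32_t stands in for mar_uint32_t, and the counter is assumed to hold the next id to hand out. It is only meaningful while the ids being compared are less than 2^31 apart.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from ckpt.c): nonzero if id a is "newer" than
 * id b when the 32-bit id space is treated as circular. The unsigned
 * subtraction wraps; the cast to int32_t assumes two's complement, which
 * holds on common compilers.
 */
static int ckpt_id_newer (uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/*
 * Hypothetical helper: given the local "next id" counter and an id learned
 * from another node, return the counter value to continue from. An id equal
 * to the counter is already taken, so it also pushes the counter forward.
 */
static uint32_t ckpt_id_sync (uint32_t next_id, uint32_t received_id)
{
	if (received_id == next_id || ckpt_id_newer (received_id, next_id)) {
		/* unsigned arithmetic: UINT32_MAX + 1 wraps to 0 */
		return (uint32_t)(received_id + 1u);
	}
	return next_id;
}
```

With this sketch, receiving id 0xFFFFFFFF while the local counter is 0xFFFFFFFE moves the counter to 0, and id 1 is still treated as newer than 0xFFFFFFFF after the wrap.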
I don't know if it is realistic for a checkpoint id to wrap, but I'd
like it to work if this were to happen after 1-2 years of runtime in
heavy checkpoint create/unlink environments.
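As a rough back-of-the-envelope check of that time frame, the sketch below estimates how long a 32-bit id counter lasts at a constant creation rate. It assumes mar_uint32_t is 32 bits wide and that exactly one id is consumed per checkpoint create; the function name years_to_wrap is hypothetical.

```c
/*
 * Hypothetical helper: years until a 32-bit id counter wraps, assuming a
 * constant rate of checkpoint creates and one id consumed per create.
 */
static double years_to_wrap (double creates_per_second)
{
	const double id_space = 4294967296.0;              /* 2^32 ids */
	const double seconds_per_year = 365.0 * 24.0 * 3600.0;

	return (id_space / creates_per_second) / seconds_per_year;
}
```

By this estimate, a wrap within 1-2 years needs a sustained rate of roughly 68 to 136 creates per second.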
Regards
-steve
On Thu, 2006-09-07 at 09:01 +0200, Hans Feldt wrote:
> Steven/Muni, could you please comment on this issue?
>
> I believe it could be the root of much evil. I would like to get this
> committed asap since it is a stopping issue for our testing.
>
> Regards,
> Hans
>
> Hans Feldt wrote:
> > Test case:
> > - start first node
> > - create (with data) checkpoint 1 on first node
> > - create (with data) checkpoint 2 on first node
> > - start 2nd node
> > - create (with data) checkpoint 3 on 2nd node
> > - read checkpoint 3 on first node (fails without patch)
> >
> > There seem to be more errors related to the ckpt_id, which was
> > introduced in r1139. Stay tuned or help us out.
> >
> > Regards,
> > Hans
> >
> >
> > ------------------------------------------------------------------------
> >
> > Index: ckpt.c
> > ===================================================================
> > --- ckpt.c (revision 1238)
> > +++ ckpt.c (working copy)
> > @@ -345,6 +345,7 @@
> >
> > DECLARE_LIST_INIT(checkpoint_recovery_list_head);
> >
> > +/* cluster wide synchronized checkpoint ID */
> > static mar_uint32_t global_ckpt_id = 0;
> >
> > struct checkpoint_cleanup {
> > @@ -2105,6 +2106,11 @@
> > log_printf (LOG_LEVEL_DEBUG, "recovery CHECKPOINT reopened is %p\n", checkpoint);
> > }
> >
> > + /* synchronize global_ckpt_id to max(ckpt_id,global_ckpt_id)+1 */
> > + if (ckpt_id > global_ckpt_id) {
> > + global_ckpt_id = ckpt_id + 1;
> > + }
> > +
> > /*CHECK to see if there are any existing ckpts*/
> > if ((checkpoint->ckpt_refcnt) && (ckpt_refcnt_total(checkpoint->ckpt_refcnt) > 0)) {
> > log_printf (LOG_LEVEL_DEBUG,"calling merge_ckpt_refcnts\n");
> >
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > Openais mailing list
> > Openais at lists.osdl.org
> > https://lists.osdl.org/mailman/listinfo/openais