[Openais] CKPT: bug, global_ckpt_id not synced

Hans Feldt Hans.Feldt at ericsson.com
Fri Sep 8 00:02:26 PDT 2006


Steven, while waiting for your ckpt rework, can I commit this change,
since the problem is blocking us? Or are you about to release something?

Regards,
Hans

Steven Dake wrote:
> Hans,
> 
> I am also working on this particular problem, although "max" is not
> sufficient since it is possible for checkpoint ids to wrap.  Ideally the
> checkpoint ids would wrap with a proper action.
> 
> I don't know if it is realistic for a checkpoint id to wrap, but I'd
> like it to work if this were to happen after 1-2 years of runtime in
> heavy checkpoint create/unlink environments.
> 
> Regards
> -steve
> 
> On Thu, 2006-09-07 at 09:01 +0200, Hans Feldt wrote:
> 
>>Steven/Muni, could you please comment on this issue?
>>
>>I believe it could be the root of much evil. I would like to get this 
>>committed asap since it is a blocking issue for our testing.
>>
>>Regards,
>>Hans
>>
>>Hans Feldt wrote:
>>
>>>Test case:
>>>- start first node
>>>- create (with data) checkpoint 1 on first node
>>>- create (with data) checkpoint 2 on first node
>>>- start 2nd node
>>>- create (with data) checkpoint 3 on 2nd node
>>>- read checkpoint 3 on first node (fails without patch)
>>>
>>>There seem to be more errors related to the ckpt_id, which was 
>>>introduced in r1139. Stay tuned or help us out.
>>>
>>>Regards,
>>>Hans
>>>
>>>
>>>------------------------------------------------------------------------
>>>
>>>Index: ckpt.c
>>>===================================================================
>>>--- ckpt.c	(revision 1238)
>>>+++ ckpt.c	(working copy)
>>>@@ -345,6 +345,7 @@
>>> 
>>> DECLARE_LIST_INIT(checkpoint_recovery_list_head);
>>> 
>>>+/* cluster wide synchronized checkpoint ID */
>>> static mar_uint32_t global_ckpt_id = 0;
>>> 
>>> struct checkpoint_cleanup {
>>>@@ -2105,6 +2106,11 @@
>>> 		log_printf (LOG_LEVEL_DEBUG, "recovery CHECKPOINT reopened is %p\n", checkpoint);
>>> 	}
>>> 
>>>+	/* keep global_ckpt_id ahead of any larger ckpt_id seen in recovery */
>>>+	if (ckpt_id > global_ckpt_id) {
>>>+		global_ckpt_id = ckpt_id + 1;
>>>+	}
>>>+
>>> 	/*CHECK to see if there are any existing ckpts*/
>>> 	if ((checkpoint->ckpt_refcnt) &&  (ckpt_refcnt_total(checkpoint->ckpt_refcnt) > 0)) {
>>> 		log_printf (LOG_LEVEL_DEBUG,"calling merge_ckpt_refcnts\n");
>>>
>>>
>>>------------------------------------------------------------------------
>>>
>>>_______________________________________________
>>>Openais mailing list
>>>Openais at lists.osdl.org
>>>https://lists.osdl.org/mailman/listinfo/openais
>>
> 
> 
> 
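
The wrap concern raised above can be illustrated with a serial-number style
comparison in place of a plain "max". What follows is only a minimal
standalone sketch, not the openais code. The names next_ckpt_id, id_after,
sync_next_ckpt_id, alloc_ckpt_id and join_and_create are hypothetical. It
assumes that ids are handed out by post-incrementing a 32-bit counter
starting at 0, and that all live ids stay within half of the id space of
each other. It does not handle reuse of an id that is still in use after a
full wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-node counter: the next id to hand out. */
static uint32_t next_ckpt_id = 0;

/*
 * Wrap-aware "a is newer than b": true when a is ahead of b by less than
 * half of the 32-bit id space. Uses only unsigned arithmetic.
 */
static int id_after(uint32_t a, uint32_t b)
{
	return a != b && (uint32_t)(a - b) < 0x80000000u;
}

/* Call for every checkpoint id learned from another node during recovery. */
static void sync_next_ckpt_id(uint32_t seen_id)
{
	uint32_t candidate = seen_id + 1; /* wraps to 0 after UINT32_MAX */

	/* Also advances when seen_id equals the current counter value. */
	if (id_after(candidate, next_ckpt_id)) {
		next_ckpt_id = candidate;
	}
}

/* Assumed allocation rule: hand out the counter, then advance it. */
static uint32_t alloc_ckpt_id(void)
{
	return next_ckpt_id++;
}

/*
 * Replays the test case from the joining node's side: start with a fresh
 * counter, learn the ids that already exist in the cluster, then create
 * one new checkpoint and return the id it received.
 */
static uint32_t join_and_create(const uint32_t *seen, unsigned int n)
{
	unsigned int i;

	assert(seen || n == 0);
	next_ckpt_id = 0;
	for (i = 0; i < n; i++) {
		sync_next_ckpt_id(seen[i]);
	}
	return alloc_ckpt_id();
}
```

Under these assumptions, a node that joins after ids 0 and 1 exist hands out
2 for its first checkpoint, and a node that has only seen id 0 hands out 1
rather than reusing 0.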



