[cgl_discussion] [coyote@coyotegulch.com: 2.6.0 and Checkpointing]

Zhao, Forrest forrest.zhao at intel.com
Sun Dec 7 22:28:25 PST 2003


I have a small idea about the second kind of checkpoint service.

From the perspective of functionality, I think the checkpoint service defined by AIS is very similar to a distributed database system. In particular, the active process stores its state to some medium (RAM or disk) at certain points; if the active process dies, the standby process retrieves the state from that medium and takes over for the dead process. A database system can provide exactly this store/retrieve operation.

So I think a distributed database system can meet the requirements defined by the data checkpoint service.
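
To make the store/retrieve idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: SQLite stands in for the distributed database, and the CheckpointStore class, its write/read methods, and the "call-server" checkpoint name are all hypothetical. None of this is the AIS checkpoint API.

```python
import json
import sqlite3


class CheckpointStore:
    """Hypothetical checkpoint store; SQLite stands in for a distributed database."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS checkpoint ("
            "name TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )

    def write(self, name, state):
        # Active process: overwrite the latest saved state for this checkpoint.
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT OR REPLACE INTO checkpoint (name, state) VALUES (?, ?)",
                (name, json.dumps(state)),
            )

    def read(self, name):
        # Standby process: fetch the last saved state, or None if never written.
        row = self.db.execute(
            "SELECT state FROM checkpoint WHERE name = ?", (name,)
        ).fetchone()
        return json.loads(row[0]) if row else None


# The active process checkpoints twice; only the latest state is kept.
store = CheckpointStore()
store.write("call-server", {"active_calls": 3, "last_seq": 41})
store.write("call-server", {"active_calls": 4, "last_seq": 42})

# After the active process dies, the standby reads the state and takes over.
recovered = store.read("call-server")
```

With a single in-memory SQLite database both roles live in one process, so this shows only the store/retrieve contract. A real deployment would need the store to be reachable from the standby node and to survive the failure of the active one.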

Am I right?

Thanks,
Forrest

-----Original Message-----
From: Pradeep Kathail [mailto:pkathail at cisco.com] 
Sent: December 8, 2003 14:05
To: Peter Badovinatz; Zhao, Forrest
Cc: Rusty Lynch; cgl_discussion at osdl.org
Subject: Re: [cgl_discussion] [coyote at coyotegulch.com: 2.6.0 and Checkpointing]

I am not sure one size is going to fit all here. There will be some legacy
applications that some communication vendors may not want to touch,
and they will be more than happy to use the first kind of service to
checkpoint the entire context. But from a CGL perspective, I would not focus
too much on this.

CGL should focus on the second issue, that is, having an API that
application developers can use to checkpoint information locally as
well as to a remote machine. This allows an application to be restarted
locally and can also support a fully hot redundant system.

Pradeep

At 11/25/2003 01:49 PM -0800, Peter Badovinatz wrote:
>Zhao, Forrest wrote:
>
>> I have some thoughts about the data checkpoint service that I would like to share with you.
>> 
>> Generally speaking, there are two kinds of checkpoint services in terms of transparency.
>> The first kind of checkpoint service is transparent to the user/application. CHPOX is such a service. Its major advantage is that there is no need to modify the application programs; the checkpointing is done transparently to the apps. Its disadvantage is that it must save the whole running context of the process, which can lead to inefficiency from saving unrelated, redundant data.
>> The second kind of checkpoint service is not transparent to the user/apps. 
>> The data checkpoint service defined by SAF (www.saforum.org)/AIS is of this kind. Its advantage is that the user can choose which specific data to checkpoint, which reduces the volume of data to be saved. Its major disadvantage is that developers have to insert checkpointing API calls into their apps in order to get the data checkpointing service.
>> 
>> So there is a tradeoff between the two kinds of services. But I'm wondering whether the carrier companies are willing to modify their product-quality software in order to get a checkpointing service?
>
>They have been in the past.  Carrier applications most certainly have
>had strict control over their checkpointing.  The SAF AIS checkpointing
>service is predicated on this (and since a number of equipment
>manufacturers were involved, they should have some idea).
>
>> 
>> Thanks,
>> Forrest
>> 
>> 
>> -----Original Message-----
>> <snip>
>
>Peter
>-- 
>Peter R. Badovinatz aka 'Wombat' -- IBM Linux Technology Center
>preferred: tabmowzo at us.ibm.com / alternate: wombat at us.ibm.com
>These are my opinions and absolutely not official opinions of IBM, Corp.
>
>_______________________________________________
>cgl_discussion mailing list
>cgl_discussion at lists.osdl.org
>http://lists.osdl.org/mailman/listinfo/cgl_discussion 



