[cgl_discussion] [firstname.lastname@example.org: 2.6.0 and
pkathail at cisco.com
Sun Jan 18 09:50:24 PST 2004
Sorry for going in and out of this discussion. Yes, checkpoint is like
a database service and should look like a distributed database for a
cluster-based system. I do not believe that was the original question.
The original question was: should saving of data be implicit (i.e. the OS
saves the entire context of the application) or explicit (i.e. the
application identifies what needs to be saved)?
As I said in my earlier email, one size does not fit all and we probably
have to support both schemes. The implicit scheme provides a good migration
opportunity for legacy applications.
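To make the two schemes concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: CheckpointStore is an in-memory stand-in for a checkpoint service (it is not the SAF AIS API), and CallCounter is a toy application. The implicit path saves the application's whole state without the application choosing anything; the explicit path saves only the field the application names.

```python
import pickle


class CheckpointStore:
    """Hypothetical in-memory stand-in for a checkpoint service."""

    def __init__(self):
        self._sections = {}

    def write(self, name, data):
        self._sections[name] = data

    def read(self, name):
        return self._sections[name]


class CallCounter:
    """Toy application: `calls` must survive a failover, `scratch` need not."""

    def __init__(self):
        self.calls = 0
        self.scratch = []

    def handle_call(self):
        self.calls += 1
        self.scratch.append("tmp" * 100)  # working data, not worth saving


def implicit_checkpoint(store, app):
    # Implicit: the whole application state is saved; the app selects nothing.
    store.write("implicit", pickle.dumps(app.__dict__))


def explicit_checkpoint(store, app):
    # Explicit: the application identifies exactly what needs to be saved.
    store.write("explicit", pickle.dumps({"calls": app.calls}))


def restore_explicit(store):
    # A standby instance picks up the saved state and takes over.
    standby = CallCounter()
    standby.calls = pickle.loads(store.read("explicit"))["calls"]
    return standby


store = CheckpointStore()
active = CallCounter()
for _ in range(3):
    active.handle_call()

implicit_checkpoint(store, active)
explicit_checkpoint(store, active)

standby = restore_explicit(store)
implicit_size = len(store.read("implicit"))
explicit_size = len(store.read("explicit"))
```

In this sketch the explicit checkpoint is smaller because it skips the scratch data, but it only works because the application was modified to call explicit_checkpoint. The implicit one needs no such change, which is the migration argument for legacy applications.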
At 12/8/2003 02:28 PM +0800, Zhao, Forrest wrote:
>I have a little idea about "second checkpoint service".
> From the perspective of functionality, I think the checkpoint service defined by AIS is very similar to a distributed database system. In particular, the active process stores its state to some medium (RAM or disk) at certain points; if the active process dies, the standby process retrieves the state from the medium and replaces the dead process. A database system can implement this store/retrieve operation.
>So I think a distributed database system can meet the requirements defined by the data checkpoint service.
>Am I right?
>From: Pradeep Kathail [mailto:pkathail at cisco.com]
>Sent: December 8, 2003 14:05
>To: Peter Badovinatz; Zhao, Forrest
>Cc: Rusty Lynch; cgl_discussion at osdl.org
>Subject: Re: [cgl_discussion] [coyote at coyotegulch.com: 2.6.0 and Checkpointing]
>I am not sure one size is going to fit all here. There will be some legacy
>applications that some communication vendors may not want to touch,
>and they will be more than happy to use the first kind of service to
>checkpoint the entire context. But from the CGL perspective, I will not focus
>too much on this.
>CGL should focus on the second issue, that is, having an API that
>application developers can use to checkpoint information locally as
>well as to a remote machine. This allows an application to be restarted
>locally and can also provide a fully hot redundant system.
>At 11/25/2003 01:49 PM -0800, Peter Badovinatz wrote:
>>Zhao, Forrest wrote:
>>> I have some thoughts about the data checkpoint service that I would like to share with you.
>>> Generally speaking, there are two kinds of checkpoint services in terms of transparency.
>>> The first kind of checkpoint service is transparent to the user/application. CHPOX is one such service. The major advantage of this kind is that there is no need to modify the application programs; the checkpointing is done transparently to apps. The disadvantage is that it must save the whole running context of the process, which can lead to inefficiency caused by saving unrelated, redundant data.
>>> The second kind of checkpoint service is not transparent to the user/apps.
>>> The data checkpoint service defined by SAF (www.saforum.org)/AIS is of this kind. The advantage is that the user can choose what specific data to checkpoint, which reduces the volume of data to be saved. The major disadvantage is that developers have to insert checkpointing API calls into their apps in order to get the data checkpointing service.
>>> So there is a tradeoff between the two kinds of services. But I'm wondering whether the carrier companies are willing to modify their production-quality software in order to get a checkpointing service?
>>They have been in the past. Carrier applications most certainly have
>>had strict control over their checkpointing. The SAF AIS checkpointing
>>service is predicated on this (and since a number of equipment
>>manufacturers were involved, they should have some idea).
>>Peter R. Badovinatz aka 'Wombat' -- IBM Linux Technology Center
>>preferred: tabmowzo at us.ibm.com / alternate: wombat at us.ibm.com
>>These are my opinions and absolutely not official opinions of IBM, Corp.
>>cgl_discussion mailing list
>>cgl_discussion at lists.osdl.org