[Openais] Re: [saf-open] [ANNOUNCE] OpenAIS open source project to implement SAForum AIS APIs

Steven Dake sdake at mvista.com
Wed Jun 23 18:53:23 PDT 2004


Jocelyn,

I've sent this to the openais mailing list instead, since it's more of a
focus for openais than for saf-open...


On Wed, 2004-06-23 at 18:36, j-cambria at ab.jp.nec.com wrote:
> Hi All,
> My name is Jocelyn, I'm new to the mailing list, and actually pretty new to SAF altogether.
> I'm trying to get a good understanding of the spec, and I already asked a few questions to Steven (thanks again !).
> As he suggested, this may be of common interest so I'd like to share my thoughts about the spec and its implementations,
> and welcome any comment !
> 
> 1/ First, the switchover to virtual synchrony in openAIS: how is that different from TCP/IP ? What are the main points 
> of doing this shift ?
> 

Virtual synchrony provides many benefits:
1. agreed ordering for reliable state machines
2. real-time performance
3. fast failure detection and configuration changes
4. messages are delivered under a known configuration for reliability
5. multicast: since most AIS commands are sent to every node,
multicast is a big performance win.  Without multicast, it is necessary
to unicast to each node.
6. strong semantics during a partition (network split)
7. strong semantics during a remerge
8. self delivery

Agreed ordering means that all nodes agree upon the order of messages.
Think of a bank.  If you have 500 dollars in your account, and you add
500, and withdraw 750, it matters what order you do it in.  If you do it
in the wrong order, or some banks apply the operations in a different
order than others, then the whole system gets out of whack.  This is the
problem with distributed state machines that do not use agreed ordering
semantics, and it is why implementing a DSM on top of virtual synchrony
is so simple.  The alternative is to implement n-phase commit protocols,
which perform poorly, are complicated, and behave poorly under
partitions and merges.
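
As a rough illustration of the bank example above, here is a minimal
Python sketch.  It assumes a toy account that rejects any withdrawal
larger than the current balance; the function name apply_ops and the
tuple format are made up for this example and are not part of openais
or the AIS spec.

```python
# Hypothetical sketch: replicas of one bank account applying operations.
def apply_ops(balance, ops):
    """Apply (kind, amount) operations in order; reject overdrafts."""
    for kind, amount in ops:
        if kind == "deposit":
            balance += amount
        elif kind == "withdraw" and balance >= amount:
            balance -= amount
        # A withdrawal larger than the balance is rejected (no change).
    return balance

ops = [("deposit", 500), ("withdraw", 750)]

# Agreed ordering: every replica applies the same sequence.
replica_a = apply_ops(500, ops)
replica_b = apply_ops(500, ops)

# No agreed ordering: this replica sees the two messages reversed,
# so the withdrawal is rejected and its state diverges.
replica_c = apply_ops(500, list(reversed(ops)))

print(replica_a, replica_b, replica_c)
```

The two replicas that share an order end up with the same balance,
while the third one disagrees even though it saw the same messages.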

Another key point is that a message is delivered under a known
configuration.  So a message is delivered in configuration 1 to all
nodes that were part of that configuration.  If a configuration change
occurs, the message will never be delivered in configuration 2, but
instead in configuration 1 (or not at all).  Again a reliability issue. 
If all nodes see the same messages in the same configuration, it is a lot
easier to guarantee correct operation during configuration changes.
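
The delivery rule above can be sketched as a toy model.  This is only
an illustration under simplifying assumptions (a single node, messages
tagged with the configuration they were sent in, no real network); the
Node class and its methods are hypothetical and do not reflect how
openais is actually implemented.

```python
# Hypothetical toy model of "delivery under a known configuration".
class Node:
    def __init__(self, name):
        self.name = name
        self.config_id = 1
        self.pending = []    # received but not yet delivered
        self.delivered = []  # (config_id, payload) in delivery order

    def receive(self, config_id, payload):
        """Queue a message tagged with the configuration it was sent in."""
        self.pending.append((config_id, payload))

    def flush(self):
        """Deliver pending messages that belong to the current configuration."""
        for config_id, payload in self.pending:
            if config_id == self.config_id:
                self.delivered.append((config_id, payload))
            # Messages from any other configuration are dropped,
            # never delivered under the wrong configuration.
        self.pending = []

    def install_config(self, new_config_id):
        """Deliver what belongs to the old configuration, then switch."""
        self.flush()
        self.config_id = new_config_id

node = Node("a")
node.receive(1, "m1")
node.install_config(2)      # m1 is delivered in configuration 1
node.receive(1, "late")     # sent in configuration 1, arrives too late
node.receive(2, "m2")
node.flush()
print(node.delivered)
```

Here "m1" is delivered in configuration 1, "late" is not delivered at
all, and "m2" is delivered in configuration 2, so no message ever
crosses a configuration boundary.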

There is probably something I am missing but that should help provide a
start.

> 2/ About the global architecture of the spec: it sounds like the cluster membership service is a dependency for 
> the 5 other services. It's also been a P1 in CGL since the 2.0 spec, and it seems to be the only one service that 
> involves kernel-level features. So my question is, would it make sense in an AIS implementation, to have the 
> cluster membership implemented at the kernel level, and the other 5 services implemented on top, in the user space ?
> 
> Actually the spec API seems high-level enough to allow a full implementation at user level, without touching the kernel
> (which I personally like). Am I misunderstanding the spec at this point, or does that sound reasonable ?
> 
I think you're right.  If someone wants to take the virtual synchrony
layer (which includes built-in membership, a requirement of VS)
and port it to the kernel, I'm happy to help.  For the moment, I'd like
to focus on getting the APIs implemented, though.

Thanks!

> Anyway, thanks in advance for any comment on this !
> 
> Bests,
> 
> Jocelyn
> 
> 
> ______________________________________________________________________
> _______________________________________________
> saf-open mailing list
> saf-open at lists.osdl.org
> http://lists.osdl.org/mailman/listinfo/saf-open



