[Openais] RE: Several questions about Group Messaging Interface

Steven Dake sdake at mvista.com
Mon Jun 28 13:14:25 PDT 2004


On Sun, 2004-06-27 at 20:35, Zhao, Forrest wrote:
> Hi, Steve
> 
> > If you don't mind, I'd like to copy most of this message to the
> > openais mailing list so others can benefit from the info.
> 
> It's great to share knowledge with others :)
> Thanks for your detailed explanation. Now I have a clear picture of 
> safe ordering and strong membership guarantee :)
> 
> I have 3 more questions for this topic:
> 
> 
> >> 2 another item "Add support for low delivery-time delay FIFO
> >> messages".
> >> Does it mean you'll make some optimization to group communication
> >> layer to reduce the delivery delay of FIFO messages?
> 
> > It is possible to reduce latency on FIFO type messages by ignoring the
> > agreed ordering rule.  The improvement would not be too dramatic,
> > except for large rings.
> 
> Why do you think "the improvement would not be too dramatic,
> except for large rings."?
> For FIFO ordering, the message order is agreed only by the sender and
> the receiver; for agreed ordering, the message order is agreed by all
> receivers, which needs several more rounds of message exchange.
> More specifically, I think that for agreed ordering a token needs to
> be passed node by node in order to achieve the agreed ordering rule;
> for FIFO ordering, however, the token is not necessary and
> traditional reliable multicast can fulfill the FIFO ordering
> requirement.
> Do you think what I say is reasonable?
> 

Actually, agreed ordering does not really need more rounds.  The
advantage lies in when a message can be delivered.  A FIFO message can
be delivered earlier because it doesn't have to wait for the fragments
of the previous agreed-order packet to be reassembled.

This reduction in delivery latency could be an improvement.  For the
moment, though, agreed ordering meets all of the requirements of openais.

> >> What types of ordering have been supported in current release?
> >>
> 
> > AGREED ordering is supported in current release.  FIFO delivery can be
> > fulfilled by AGREED ordering.
> 
> How can a user specify what kind of ordering to use?
> That is, if a user wants to use different ordering services, will he
> invoke a different API, or the same API with different input
> parameters?
> Also, as we know, there are 3 types of ordering: FIFO, causal and agreed
> ordering.
> Do you have any plan to support causal ordering? Or do you plan to 
> provide an API that offers a causal ordering service to the user?
> Also, although causal ordering can be fulfilled by agreed ordering, do
> you plan to optimize causal ordering to improve performance?
> 

The gmi_mcast function should be extended to take a "message type"
argument which could be FIFO, AGREED, or SAFE.  It currently doesn't do
this.
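
As a rough sketch of what such an extension could look like (Python is
used only for brevity; the real gmi_mcast is C, and the names
MessageType, satisfies and mcast below are hypothetical, not the current
openais API).  It assumes each stronger guarantee also covers the weaker
ones, which is how FIFO can be fulfilled by AGREED:

```python
from enum import Enum


class MessageType(Enum):
    """Hypothetical delivery guarantees, weakest to strongest."""
    FIFO = 1
    AGREED = 2
    SAFE = 3


def satisfies(provided, requested):
    """True if a message sent as `provided` also meets `requested`."""
    return provided.value >= requested.value


def mcast(payload, msg_type=MessageType.AGREED):
    """Hypothetical stand-in for an extended gmi_mcast.

    It only records the request; a real implementation would queue the
    message until the token arrives.
    """
    return {"payload": payload, "type": msg_type}


request = mcast(b"hello", MessageType.FIFO)
```

Defaulting to AGREED in the sketch mirrors the current behaviour, where
every message is sent with agreed ordering.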

The ring protocol in use by openais can't provide any optimizations for
causal ordering (that I can figure out).

The ring protocol is a send-time protocol, meaning that the order of
messages is determined at message send time.  The other class of
protocols is receive-time protocols, where message ordering is
determined at receive time.  Receive-time protocols can provide
advantages for causal ordering.  Unfortunately, these receive-time
protocols are very complicated and require constant messaging;
otherwise they cannot build the agreed ordering of the messages.  Agreed
ordering is much more important for openais than causal ordering.


> 
> The last question is the question that has confused me for a long time!
> As we know, there are two kinds of membership service. One is node
> level (i.e. processor level) membership, the other is process level
> membership.
> So the real situation in the cluster may be as follows:
> 1 Node A, B, C and D form the node level membership view
> 2 process 1 on node A and process 2 on node B form a process level
> membership relationship (i.e. group); there's no other process 
> level membership in the cluster.
> 
> My question is: is the agreed safe ordering provided on node level
> membership or on process level membership? In the above case, will the
> token be passed A->B->A or A->B->C->D->A?
> Moreover, I think that if it is implemented at node level, the messages
> sent by process 1 on node A need to be acked by B, C and D; however, if
> it is implemented at process level, the messages sent by process 1 on
> node A only need to be acked by B. Am I right?
> What's the type of your current implementation?
> 
Process level ordering and processor ordering are the same (the
processor ordering is used).  Currently openais only implements
processor level membership, since process level membership is
unnecessary for openais (there is only one process group).

The ring protocol used does not require acks.  It belongs to a class of
protocols called "negative ack" protocols.  An ORF (ordering,
reliability, flow control) token is passed around a ring of processors.
The token carries a list of (configuration, sequence id) pairs naming
messages that earlier processors don't have a copy of.  When the token
arrives at a processor, that processor re-mcasts any messages on the
list that it holds.  If the processor is itself missing some messages,
it adds those message sequence ids to the list.

The sequence id is stored in the token, and increased each time a
message is multicast on the ring.
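
To make the token handling concrete, here is a minimal simulation of
the steps above.  It is a sketch under assumptions, not the openais
code: the names Token, Processor, multicast and the `lose` parameter are
made up for illustration, configuration ids are left out, and message
loss is injected by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    seq: int = 0                           # last sequence id assigned on the ring
    rtr: set = field(default_factory=set)  # ids some processor reported missing


def multicast(ring, seq, payload, lose=()):
    """Deliver to every processor except those named in `lose`."""
    for proc in ring:
        if proc.name not in lose:
            proc.received[seq] = payload


class Processor:
    def __init__(self, name):
        self.name = name
        self.received = {}   # sequence id -> payload
        self.pending = []    # new payloads waiting for the token

    def on_token(self, token, ring, lose=()):
        # 1. Re-mcast requested messages that this processor holds.
        for seq in sorted(token.rtr):
            if seq in self.received:
                multicast(ring, seq, self.received[seq])
                token.rtr.discard(seq)
        # 2. Multicast new messages, taking sequence ids from the token.
        for payload in self.pending:
            token.seq += 1
            multicast(ring, token.seq, payload, lose)
        self.pending = []
        # 3. Negative ack: list the sequence ids this processor is missing.
        for seq in range(1, token.seq + 1):
            if seq not in self.received:
                token.rtr.add(seq)


a, b, c = Processor("A"), Processor("B"), Processor("C")
ring = [a, b, c]
token = Token()
a.pending.append("m1")
a.on_token(token, ring, lose={"C"})  # C misses the multicast of m1
b.on_token(token, ring)
c.on_token(token, ring)              # C notices the gap and requests id 1
a.on_token(token, ring)              # A re-mcasts id 1; C now has it
```

No processor ever acknowledges a message it received; only the gap
noticed by C shows up in the token.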

Keeping track of when to free the messages is very difficult, as is
tracking fragmentation.  Some switches will fail to make forward
progress on fragmented UDP packets, so the protocol must do its own
fragmentation and use its own recovery mechanisms (since UDP has none).
Also notably difficult is establishing the ring configuration
(membership) so that the ORF token can be initiated.
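
For illustration, protocol-level fragmentation could look roughly like
the following.  This is a sketch only: the fragment layout (message id,
index, total, chunk) and the names fragment and Reassembler are
assumptions, not the wire format openais uses, and recovery of lost
fragments is left to the retransmit mechanism.

```python
def fragment(msg_id, payload, max_chunk):
    """Split payload into (msg_id, index, total, chunk) tuples."""
    chunks = [payload[i:i + max_chunk]
              for i in range(0, len(payload), max_chunk)] or [b""]
    return [(msg_id, i, len(chunks), chunk) for i, chunk in enumerate(chunks)]


class Reassembler:
    def __init__(self):
        self.partial = {}    # msg_id -> {index: chunk}

    def add(self, frag):
        """Store one fragment; return the whole payload once complete."""
        msg_id, index, total, chunk = frag
        parts = self.partial.setdefault(msg_id, {})
        parts[index] = chunk
        if len(parts) < total:
            return None      # still waiting for more fragments
        del self.partial[msg_id]   # the buffer can be freed only now
        return b"".join(parts[i] for i in range(total))


frags = fragment(7, b"abcdefgh", 3)
reasm = Reassembler()
results = [reasm.add(f) for f in reversed(frags)]  # arrival order may vary
```

Each chunk is kept small enough to fit in a single unfragmented UDP
packet, so the switches never see IP-level fragments.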


> Thanks,
> Forrest



