[cgl_discussion] Re: [dcl_discussion] ANNOUNCE: OSDL Clusters
lmb at suse.de
Thu Nov 27 01:47:58 PST 2003
Steven Dake <sdake at mvista.com> said:
> The one place the kernel can really help, and should not be dismissed
> lightly, is reliable totally ordered messaging. This is a clear
> requirement of any clustering infrastructure (including cluster
> membership) and is best implemented by interrupt-driven timer sources.
Yes. I fully agree.
Actually, I have an interesting offer here. The Transis project has
implemented such a group messaging service
(http://www.cs.huji.ac.il/labs/transis/transis.html). I have spoken with
Professor Dolev, and he would be willing to provide this suite to the
Open Source community.
However, the Linux HA project lacks the resources to take full advantage
of this offer. Maybe this is something where OSDL could step in?
(The algorithms are all there. But Transis is implemented entirely in
user space, so eventually a port to kernel space to meet the OSDL
requirements may be necessary, even if I don't agree with those
> Without totally ordered messaging, properly implementing distributed
> application failover for 100% of failure cases is (*not impossible, but
> close*). Totally ordered reliable messaging that doesn't violate
> causality can be implemented in user space, but then poll must be used
> to simulate timers, which really doesn't work that well.
We are getting side-tracked, but total ordering is a much stricter
requirement than causal ordering, and neither per se requires timers
(though they could be used in partially synchronous setups to optimize
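
To make the distinction concrete, here is a minimal sketch in Python. It
is only an illustration under stated assumptions: the class and method
names are hypothetical and not taken from Transis or any OSDL code,
causal order is tracked with vector clocks, and total order is assumed
to come from a single sequencer that numbers every message. Delivery in
both cases is driven purely by message arrival, with no timers.

```python
# Sketch only: all names here are hypothetical, not from Transis.

class CausalReceiver:
    """Delivers messages in causal order using vector clocks."""

    def __init__(self, n_procs):
        self.clock = [0] * n_procs  # messages delivered so far, per sender
        self.pending = []           # received but not yet deliverable
        self.delivered = []

    def _deliverable(self, sender, stamp):
        # It must be the next message from this sender, and we must already
        # have delivered everything the sender had delivered when it sent.
        return stamp[sender] == self.clock[sender] + 1 and all(
            stamp[k] <= self.clock[k]
            for k in range(len(stamp)) if k != sender
        )

    def receive(self, sender, stamp, payload):
        self.pending.append((sender, stamp, payload))
        progress = True
        while progress:  # keep draining until nothing more can be delivered
            progress = False
            for msg in list(self.pending):
                s, st, p = msg
                if self._deliverable(s, st):
                    self.pending.remove(msg)
                    self.clock[s] += 1
                    self.delivered.append(p)
                    progress = True


class TotalOrderReceiver:
    """Delivers messages in the order fixed by a sequencer's numbering."""

    def __init__(self):
        self.next_seq = 0
        self.pending = {}  # seq -> payload, buffered until the gap closes
        self.delivered = []

    def receive(self, seq, payload):
        self.pending[seq] = payload
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


# "a" and "b" are concurrent, so causal order allows either delivery order.
r1, r2 = CausalReceiver(2), CausalReceiver(2)
r1.receive(0, [1, 0], "a"); r1.receive(1, [0, 1], "b")
r2.receive(1, [0, 1], "b"); r2.receive(0, [1, 0], "a")
print(r1.delivered, r2.delivered)  # the two nodes disagree on the order

# With sequencer numbers a=0, b=1, every node delivers in the same order.
t1, t2 = TotalOrderReceiver(), TotalOrderReceiver()
t1.receive(1, "b"); t1.receive(0, "a")
t2.receive(0, "a"); t2.receive(1, "b")
print(t1.delivered, t2.delivered)  # identical on both nodes
```

The two causal receivers end up with different delivery orders for the
concurrent messages, which causal ordering permits; the total-order
receivers agree regardless of arrival order. What the sketch leaves out
is failure handling (a crashed sequencer, lost messages), which is where
membership and, in practice, timeouts tend to enter.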
I'm going to subscribe to osdl-clusters now and propose we take that
discussion there ;-)
Lars Marowsky-Brée <lmb at suse.de>
High Availability & Clustering \ ever tried. ever failed. no matter.
SUSE Labs | try again. fail again. fail better.
Research & Development, SUSE LINUX AG \ -- Samuel Beckett