[Linux-cluster] Re: [cgl_discussion] Re: [dcl_discussion] Cluster summit materials

Daniel McNeil daniel at osdl.org
Wed Aug 11 14:24:49 PDT 2004

On Wed, 2004-08-11 at 11:54, Daniel Phillips wrote:
> On Wednesday 11 August 2004 12:18, John Cherry wrote:
> > On Tue, 2004-08-10 at 13:58, Lars Marowsky-Bree wrote:
> > > On 2004-08-10T13:49:26, John Cherry <cherry at osdl.org> said:
> > > > While these common components all have RHAT/Sistina roots, these
> > > > components are in the best position for mainline acceptance.  As
> > > > APIs are defined for these services, other implementations could
> > > > also be used (the vfs model).
> > >
> > > This isn't quite true. cman as a whole is not quite in the best
> > > position for mainline acceptance; actually, most isn't.
> That's accurate; that's why I keep beating on the 'read the code' issue, 
> not to mention trying it, and hacking it.
> > I realize that cman will probably be at "alpha" level maturity in
> > October, but we did not discuss any other possibilities for kernel
> > level membership/communication.
> I believe it was briefly mentioned that we mainly use bog-standard tcp 
> socket streams for communication.  I'll add that various subsystems 
> incorporate their own reliability logic, and maybe one day far from 
> now, we'll be able to unify all of that.  For now, it's a little 
> ambitious, not to mention unnecessary.
> > linux-ha and openais have user level  
> > components.  I suppose SSI membership could be considered as a
> > candidate implementation for the initial merge, but the consensus was
> > that we would focus on cman, define the APIs, and use cman as the
> > initial membership/communication module.  Multiple implementations
> > would be good and if we do a good job defining the APIs (membership,
> > communication, fencing), other membership services could be used down
> > the road.
> IMHO, for the time being only failure detection and failover really has 
> to be unified, and that is CMAN, including interaction with other bits 
> and pieces, i.e., Magma and fencing, and hopefully other systems like 
> Lars' SCRAT.  As far as CMAN goes, Lars and Alan seem to be the main 
> parties outside Red Hat.  Lon and Patrick are most active inside Red 
> Hat.  I think we'd advance fastest if they start hacking each other's 
> code (anybody I just overlooked, please bellow).

I'm not sure what you mean by "failure detection and failover".
Do you mean node failure detection and consensus membership change?

I thought Magma was just Red Hat's backward compatibility layer.
What "interaction" are you worried about?

How fencing integrates and when it occurs might be issues we
will need to think about more.

> However it goes, this process is going to take time.  Two months would 
> be blindingly fast, and that is before we even think about pushing to 
> Andrew.
> > Was I at a different summit than you attended, or is that your
> > understanding of the strategic direction of moving Linux to be a
> > "clusterable kernel"?
> That seemed to be the consensus at the summit I attended.  Note that 
> we've already got the basic changes to the VFS in place, with a few 
> small exceptions.
> I still think that gdlm can go to Andrew before CMAN, however that is 
> contingent on working out a way to invert the link-level dependency on 
> CMAN so that the OCFS2 guys and people who want to experiment with 
> dlm-style coding can try it without being forced to adopt a lot of 
> other, less stable infrastructure at the same time.  This will be going 
> forward in parallel with the CMAN api work.

How can the DLM go to Andrew without a membership layer to
provide membership?

I would think we need the DLM to actually be working...

> > How can we have membership without some form of communication
> > service? (communication-based membership or connectivity-based
> > membership)
> >
> > The low level cluster communication mechanism is one of those
> > services that I believe we need an API definition for since it will
> > also be leveraged by higher level services such as group messaging or
> > an event service.
> >
> > So you can call the core service "membership", but what we really
> > need is membership/communication, which is what cman provides.  Do
> > you have another suggestion for this?  TIPC + membership?
> I think you really mean "connection manager", not "communication 
> service".  I'll step back from this now and watch you guys sort it 
> out :-)

I think John really does mean communication.  For high availability,
the cluster should have no single point of failure.  This usually
means multiple ethernet links.  (I assume CMAN supports multiple
links.)  Determining membership requires a way of sending messages
between the nodes.  Ideally, losing one ethernet link would be
handled without causing any membership change.

This kind of intra-cluster communication would be valuable for 
other cluster components as well.  Example: a cluster snapshot :)
or cluster mirror device should be able to send messages to
other nodes in the cluster without having to worry about which
specific link to use and what to do if a link fails.  This would
also be valuable for the DLM.
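
To make the idea concrete, here is a minimal sketch in Python of a
link-agnostic send with failover.  Everything in it is an assumption
made for illustration: the names ClusterMessenger, NoLinkAvailable,
dead_link and good_link are hypothetical, a "link" is modelled as a
plain callable, and none of this is taken from CMAN or any other
existing cluster code.

```python
from typing import Callable, Dict, List

# Assumption: a "link" is any callable that delivers bytes or raises OSError.
Link = Callable[[bytes], None]


class NoLinkAvailable(Exception):
    """Raised when every configured link to a node has failed."""


class ClusterMessenger:
    """Hypothetical link-agnostic sender (illustration only, not CMAN's API)."""

    def __init__(self, links_by_node: Dict[str, List[Link]]) -> None:
        self.links_by_node = links_by_node

    def send(self, node: str, payload: bytes) -> int:
        """Try each link in order; return the index of the link that worked."""
        for index, link in enumerate(self.links_by_node[node]):
            try:
                link(payload)
                return index
            except OSError:
                continue  # this link is down, try the next one
        raise NoLinkAvailable(node)


# Two stand-in links: the first always fails, the second records the message.
delivered: List[bytes] = []


def dead_link(payload: bytes) -> None:
    raise OSError("link down")


def good_link(payload: bytes) -> None:
    delivered.append(payload)


messenger = ClusterMessenger({"node2": [dead_link, good_link]})
used = messenger.send("node2", b"snapshot-request")
```

The caller (a snapshot device, a mirror device, the DLM) names only the
destination node.  Which link carried the message, and what happened
when the first one failed, stays inside the messenger.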

Does CMAN provide this kind of functionality?  If so, then it
really is a communication service.

Daniel McNeil 

> Regards,
> Daniel
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
