[cgl_discussion] Re: [dcl_discussion] ANNOUNCE: OSDL Clusters (foundational components)

John Cherry cherry at osdl.org
Wed Nov 26 14:23:25 PST 2003

On Wed, 2003-11-26 at 13:27, Lars Marowsky-Bree wrote:
> On 2003-11-26T13:07:47,
>    John Cherry <cherry at osdl.org> said:
> John,
> please let me briefly voice my strong concern about starting yet another
> project for clustering. Linux already has many membership services etc;
> what would be needed is integration and shepherding and standard APIs,
> not another implementation! 

I understand exactly where you are coming from.  The intent of these
foundational projects is not to create yet another membership service. 
The goal is to make the Linux OS a "clusterable OS".  The plan is
to leverage TIPC as an industry-hardened connectivity/communication
layer in the kernel and to build a thin (perhaps even pluggable)
membership layer on top of it.  TIPC does most of the hard work.

I have heard all the arguments about implementing services in kernel
space.  The arguments for portability, vendor acceptance, etc. are still
there.  However, the intent of this project is to produce a clusterable
OS.  User space implementations can leverage these intrinsic membership
events or supply their own.  It really doesn't matter.  The interface
will be a standard one (SAF-AIS).
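
To make "subscribing to membership events" concrete, here is a minimal
sketch of the idea.  It is only an illustration under stated assumptions:
every name in it (MembershipService, MembershipEvent, node_up, node_down)
is hypothetical, and it is neither the SAF-AIS API nor anything TIPC
provides.  In the real design the up/down notifications would come from
the connectivity layer rather than from direct calls.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, FrozenSet, List, Set


class Change(Enum):
    JOINED = "joined"
    LEFT = "left"


@dataclass(frozen=True)
class MembershipEvent:
    """Hypothetical event delivered to every subscriber."""
    node: str
    change: Change


class MembershipService:
    """Hypothetical thin membership layer (not the SAF-AIS interface).

    It only tracks which nodes are reachable and fans events out to
    subscribers; detecting reachability is assumed to be done elsewhere.
    """

    def __init__(self) -> None:
        self._members: Set[str] = set()
        self._subscribers: List[Callable[[MembershipEvent], None]] = []

    def subscribe(self, callback: Callable[[MembershipEvent], None]) -> None:
        # Cluster services (checkpointing, resource managers, ...) register here.
        self._subscribers.append(callback)

    def members(self) -> FrozenSet[str]:
        return frozenset(self._members)

    def node_up(self, node: str) -> None:
        # Assumed to be driven by the connectivity layer; duplicates are ignored.
        if node not in self._members:
            self._members.add(node)
            self._publish(MembershipEvent(node, Change.JOINED))

    def node_down(self, node: str) -> None:
        # Unknown nodes are ignored so subscribers never see a spurious LEFT.
        if node in self._members:
            self._members.remove(node)
            self._publish(MembershipEvent(node, Change.LEFT))

    def _publish(self, event: MembershipEvent) -> None:
        # Iterate over a copy so a callback may subscribe during delivery.
        for callback in list(self._subscribers):
            callback(event)
```

A user-space service would call subscribe() once and react to the events
it receives, regardless of which membership implementation sits behind
the interface.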

> That is not a good use of engineering resources.
> Are you proposing to supplement one of these, or do you really, really
> want to start one from scratch?!

The biggest part of this development is TIPC and that is leveraged from
the Ericsson implementation.  It is a kernel module that requires no
additional kernel hooks.  We are not starting this from scratch.  The
membership service will be implemented from scratch, but we certainly
are not ruling out working with other developers to leverage existing
implementations.  The main rationale for keeping the membership service
mainly in the kernel is clean and efficient node isolation (both
murder and suicide).

> There are also reasonably strong reasons arguing against implementing
> these in kernel space. I'm sure you have heard the summary of them:
> Maintenance, mainline / vendor acceptance etc.
> Could you please clarify the rationale?
> > the network.  Applications and other cluster services (checkpointing,
> > resource managers, etc.) would simply subscribe to membership events.
> Implementation for data checkpointing and some others actually already
> exist for Linux heartbeat, and all in user-space.

We don't plan on implementing these services.  The Linux heartbeat
services (checkpointing, resource managers, etc.) can choose to leverage
any membership service with an SAF-AIS interface.

I really appreciate your concerns, Lars.  I don't want to reinvent any
wheels either.

