[cgl_discussion] Use case - Cluster Locking Service

John Cherry cherry at osdl.org
Wed Mar 16 17:12:55 PST 2005

The following is a use case for a Cluster Locking Service.  This
addresses CLS.1.0 Cluster Lock Service (SA Forum AIS).  The "Scenarios"
section is thin, but perhaps CGL users of the cluster lock service can
help me fill in this section.   Enjoy.


Use Case: Cluster Lock Service (SA Forum AIS)

Version 0.2 Last Modified Date: 3/16/05

The basis for peaceful coexistence of cooperating parties is the ability
to share well. Cooperating applications within a cluster must coordinate
access to shared resources to avoid corruption of those resources. A
distributed lock manager (DLM) provides advisory locking services used
to coordinate access to any arbitrary shared resource.

The Service Availability Forum (SA Forum) Application Interface
Specification (AIS) describes the application interface to the lock
service in Volume 7 of the specification.  It is intended for use
by implementors of the AIS and by application developers who would
use the AIS to develop applications that must be highly available.
See http://www.saforum.org/ for the latest version of the specification.

The cluster lock service does not enforce good sharing behavior, but
rather provides cooperating applications with information regarding
the state of a shared resource. All locks are advisory, that is,
voluntary. The system does not enforce locking. Instead, applications
running on the cluster must cooperate for locking to work. An application
that wants to use a shared resource is responsible for obtaining
a lock on that resource before attempting to access it.
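
To make the advisory nature concrete, here is a minimal single-process
sketch. It is not the SA Forum AIS interface; AdvisoryLock,
cooperative_write and rogue_write are hypothetical names chosen only
for illustration. It shows that the lock protects the resource only
from writers that bother to ask for it.

```python
# Toy illustration of advisory locking. All names here are hypothetical
# and are not part of the SA Forum AIS.
class AdvisoryLock:
    def __init__(self):
        self.holder = None

    def try_acquire(self, who):
        """Grant the lock only if nobody currently holds it."""
        if self.holder is None:
            self.holder = who
            return True
        return False

    def release(self, who):
        """Only the current holder can release the lock."""
        if self.holder == who:
            self.holder = None


shared = {"value": 0}   # stands in for any shared resource
lock = AdvisoryLock()


def cooperative_write(who, value):
    """Well-behaved writer: touches the resource only while holding the lock."""
    if not lock.try_acquire(who):
        return False
    try:
        shared["value"] = value
        return True
    finally:
        lock.release(who)


def rogue_write(value):
    """Ill-behaved writer: nothing stops it from ignoring the lock."""
    shared["value"] = value
```

While one party holds the lock, cooperative_write() from another party
is refused, but rogue_write() still succeeds. That is exactly the
behavior the lock service cannot prevent.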

The lock manager defines a lock resource as the lockable entity. The
lock manager creates a lock resource the first time an application
requests a lock against it. A single lock resource can have one or many
locks associated with it. A lock is always associated with one lock
resource. The lock manager provides a single, unified lock image shared
among all nodes in the cluster.
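
The relationship between lock resources and locks can be sketched as
a small data model. This is only an illustration under assumed names
(LockTable, LockResource, Lock); it records requests and deliberately
ignores conflicts between lock modes.

```python
# Hypothetical data model: one lock resource, many locks; each lock
# belongs to exactly one lock resource.
class LockResource:
    def __init__(self, name):
        self.name = name
        self.locks = []          # every lock requested against this resource


class Lock:
    def __init__(self, owner, mode, resource):
        self.owner = owner
        self.mode = mode
        self.resource = resource  # the single resource this lock belongs to


class LockTable:
    def __init__(self):
        self.resources = {}       # resource name -> LockResource

    def request(self, name, owner, mode):
        """Create the lock resource on first request, then attach a lock."""
        res = self.resources.get(name)
        if res is None:
            res = self.resources[name] = LockResource(name)
        lock = Lock(owner, mode, res)
        res.locks.append(lock)
        return lock
```

Two requests against the same name end up attached to the same
LockResource object, which was created by the first of them.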

The cluster lock service expects to operate in a cluster in conjunction
with another cluster infrastructure environment that provides a consistent
view of cluster membership (all nodes agree on cluster membership)
and node liveness (the node is a healthy part of the cluster).

When a node fails, the cluster lock service instances running on the
surviving cluster nodes release the locks held by the failed node. The
cluster lock service then processes lock requests from surviving nodes
that were previously blocked by locks owned by the failed node.
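
The recovery behavior described above can be sketched as follows.
This is a simplified model, not an actual implementation: it assumes
exclusive locks only, a FIFO wait queue per resource, and a
node_failed() notification delivered by some membership service. The
names ToyLockService and Resource are hypothetical.

```python
from collections import deque


class Resource:
    def __init__(self):
        self.holder = None       # (node, owner) pair, or None if free
        self.waiters = deque()   # blocked requests in arrival order


class ToyLockService:
    """Exclusive-only lock service, just enough to show failure cleanup."""

    def __init__(self):
        self.resources = {}

    def lock(self, name, node, owner):
        res = self.resources.setdefault(name, Resource())
        request = (node, owner)
        if res.holder is None:
            res.holder = request
            return "granted"
        res.waiters.append(request)
        return "blocked"

    def node_failed(self, failed):
        """Release locks held by the failed node, then unblock survivors."""
        newly_granted = []
        for name, res in self.resources.items():
            # Requests queued by the failed node can never be served.
            res.waiters = deque(r for r in res.waiters if r[0] != failed)
            if res.holder is not None and res.holder[0] == failed:
                # Hand the lock to the first surviving waiter, if any.
                res.holder = res.waiters.popleft() if res.waiters else None
                if res.holder is not None:
                    newly_granted.append((name, res.holder))
        return newly_granted
```

A request from a surviving node that was blocked behind the failed
node's lock is returned by node_failed() as newly granted.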

Desired Outcome
Core cluster services such as a cluster membership service and a
cluster lock service should be common across clustering implementations.
Eventually, these services may become mainline Linux kernel capabilities
and a Linux kernel could be configured as a “clusterable” kernel.
The SA Forum AIS provides a well defined interface to these services.
It is important that the consumers of the core resources such as
shared storage, clustered filesystems, resource managers, and custom
applications have a well defined interface to the core services and that
they can be developed against these standard interfaces with assurances
that the interfaces will be stable over time.

A couple of open source user level clustering implementations are
leveraging SA Forum AIS (Linux-HA and OpenAIS).  Neither of these
projects currently has a cluster lock service.  It is the intent of both
projects to leverage the work that the linux-clusters team is doing with
a version of OpenDLM.  The linux-clusters DLM uses a programming model
for distributed locking that has stood the test of time.  It is the
one first popularized on the VAX Cluster [Roy G. Davis, ``VAX Cluster
Principles'', Digital Press, 1993]. This model then became widespread on
proprietary Unix(tm) after Oracle required its usage for Oracle Parallel
Server and its predecessors.  An interface will probably need to be
added to this DLM to create SA Forum-type interfaces, which are really
a subset of the VAX Cluster interfaces.  There are several proprietary
clustering products that are being developed with SA Forum interfaces,
but these are out of the scope of this use case.  Suffice it to say
that proprietary clustering implementations would also benefit from a
common set of open cluster services in the Linux kernel.

An OSDL Special Interest Group (SIG) has been established for ongoing
discussions regarding common open source clustering services.
It is likely that this group will define the common cluster
services and drive implementations into the kernel (where needed).
See http://developer.osdl.org/dev/clusters/.  Andrew Morton has stated
publicly that common clustering services will need to be supported by
the clustering community and not just single clustering projects.

Cluster lock server developers:  Both open source and proprietary
implementations are being developed with SA Forum interfaces.

Application developers: Applications which access shared resources in
a cluster would be written with SA Forum interfaces to leverage the
cluster lock service to coordinate access to these shared resources.

Applications and services that can benefit from using a distributed
lock manager are transaction-oriented, such as a database or a resource
controller or manager. It can also be used for system services such as
a distributed filesystem for shared files or meta-data.

Cluster aware applications - TBS 
Clustered volume management - TBS
Clustered filesystem - TBS 
Cluster-wide resource management - TBS 
Cluster administration - TBS

Implementation Notes
These Implementation Notes apply to all scenarios.  These are just
guidelines, not cast in concrete.

The Cluster Lock Service provides entities, called lock resources,
that are used to synchronize access to shared resources.

The Cluster Lock Service provides a simple lock model supporting two
locking modes: exclusive access and shared access.  All implementations
must offer synchronous and asynchronous calls, lock timeout, trylock,
and lock wait notifications.  Implementations may optionally offer the
additional features of deadlock detection and lock orphaning.
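
The two-mode model and trylock can be illustrated with a small
sketch. The names SHARED, EXCLUSIVE, compatible and ModeLock are
hypothetical and do not come from the specification. The sketch
assumes each owner holds at most one lock on the resource, and it
leaves out blocking calls, timeouts and wait notifications.

```python
SHARED, EXCLUSIVE = "shared", "exclusive"


def compatible(requested, held_modes):
    """A request is grantable if the resource is free, or if the request
    and every lock already granted are shared."""
    if not held_modes:
        return True
    return requested == SHARED and all(m == SHARED for m in held_modes)


class ModeLock:
    """One lock resource with shared/exclusive modes and trylock only."""

    def __init__(self):
        self.granted = {}   # owner -> mode

    def trylock(self, owner, mode):
        """Grant immediately if compatible; otherwise fail without waiting."""
        if compatible(mode, list(self.granted.values())):
            self.granted[owner] = mode
            return True
        return False

    def unlock(self, owner):
        self.granted.pop(owner, None)
```

Any number of shared holders can coexist, while an exclusive request
succeeds only when the resource has no holders at all.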

The locks provided by the lock service are not recursive.  Thus, claiming
one lock does not implicitly claim another lock; rather, each lock must
be claimed individually.

The following is a guideline for mastering lock resources. Within
this cluster-wide lock image, the lock manager maintains one master
copy of each lock resource. This master copy can reside on any cluster
node. Initially, the master copy resides on the node on which the lock
request originated.  The lock manager maintains a cluster-wide directory
of the locations of the master copy of all the lock resources within the
cluster. The lock manager attempts to evenly divide the contents of this
directory across all cluster nodes. When an application requests a lock
on a lock resource, the lock manager first determines which node holds
the directory entry and then reads the directory entry to find out which
node holds the master copy of the lock resource.  By allowing all nodes
to maintain the master copy of lock resources, instead of having one
primary lock manager in a cluster, the lock manager can reduce network
traffic in cases when the lock request can be handled on the local
node. Handling the requests on the local node also avoids the potential
bottleneck resulting from having one primary lock manager and reduces
the time required to reconstruct the lock database when a failover
occurs.
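
The two-step lookup in this guideline (find the directory node, then
read the entry to find the master) might look roughly like the sketch
below. Hashing the resource name to pick the directory node is an
assumption made here for illustration; the guideline only says that
the directory is divided evenly across nodes. ToyCluster and its
methods are hypothetical names.

```python
import zlib


class ToyCluster:
    def __init__(self, nodes):
        self.nodes = sorted(nodes)
        # Each node stores its share of the directory:
        # resource name -> node holding the master copy.
        self.directory = {n: {} for n in self.nodes}

    def directory_node(self, resource):
        """Step 1: every node computes the same directory node from a
        stable hash of the resource name (assumed distribution scheme)."""
        index = zlib.crc32(resource.encode("utf-8")) % len(self.nodes)
        return self.nodes[index]

    def find_master(self, resource, requester):
        """Step 2: read the directory entry. On the first request the
        master copy is created on the requesting node."""
        entries = self.directory[self.directory_node(resource)]
        return entries.setdefault(resource, requester)
```

When the requester turns out to be the master, the request can be
handled locally with no further network traffic, which is the saving
the guideline is after.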

References
Greg Pfister, ``In Search of Clusters'', Second Edition, Prentice Hall
PTR, 1998

Tushar Chandra, Vassos Hadzilacos, Sam Toueg, ``The Weakest Failure
Detector for Solving Consensus'', June 1996

Danny Dolev and Dalia Malki, ``The Transis Approach to High Availability
Cluster Communication'', Comm. of the ACM, Vol 39, No. 4, April 1996,
pp. 64-70

Robbert van Renesse, Kenneth P. Birman, Silvano Maffeis, ``HORUS: A
flexible Group Communication System'', Comm. of the ACM, Vol 39, No. 4,
April 1996, pp. 76-83

Kenneth P. Birman, ``Building Secure and Reliable Network Applications'',
Manning Publishing Company and Prentice Hall, 1997

Ken Birman, et al., ``The Horus and Ensemble Projects: Accomplishments
and Limitations'', circa 2000

Roy G. Davis, ``VAX Cluster Principles'', Digital Press, 1993

Chuck Simmons, Patty Greenwald, ``Oracle Lock Manager Requirements'',
Oracle Corporation, July 1994

Kristin Thomas, ``Programming Locking Applications'', IBM Corporation,
