tabmowzo at us.ibm.com
Thu Feb 13 17:10:59 PST 2003
Jon Maloy wrote:
> Corey Minyard wrote:
> > Another, somewhat related question. How do you handle system
> > partitions? This was the nastiest problem we had to deal with
> > (especially since we were multi-site distributed).
> As for now we can not (logically) partition clusters, since the
> subnetwork concept is not fully implemented. It is however possible to
> physically/geographically partition clusters if we let the inter-site
> links go over UDP or TCP. The condition is that we still configure the
> cluster as a full-connectivity network. Of course, one must as always
> understand the bandwidth constraints in such cases. Again, this has
> never been tried in real products, so this is an unknown factor for now.
I believe the issue is not intentional partitioning (i.e., 'subsetting')
of the cluster, but rather the case where the cluster is split by
failures in the communications links, so that we end up with two
(sub)clusters. Both may remain fully interconnected internally, but each
(sub)cluster has lost all communications links to the other.
This is also called a split-brain situation; I used to use the term
"sundered" cluster, because "partition" is too overloaded.
I believe that at the TIPC level, the nodes contained in each
(sub)cluster would consider all nodes in the other (sub)cluster to be
dead, even though they aren't. This becomes of interest when there are
resources (disks, applications, etc.) that are split across the two
(sub)clusters. A traditional technique is majority quorum: whichever
(sub)cluster has at least N/2+1 of the N configured nodes (integer
division) wins and continues operating (or pick some other algorithm,
there are many), and the other (sub)cluster releases all resources (or
shuts down, whatever).
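As a rough illustration of the majority-quorum rule, here is a minimal
Python sketch. The names has_quorum and decide are hypothetical, and it
assumes every node knows the total configured cluster size N. It is not
part of TIPC or of any particular cluster manager.

```python
def has_quorum(reachable: int, total: int) -> bool:
    """Return True if `reachable` nodes (including the local node)
    form a strict majority of the `total` configured nodes."""
    if total <= 0 or not 0 <= reachable <= total:
        raise ValueError("need total > 0 and 0 <= reachable <= total")
    # Integer division: for total = 5 the threshold is 3, for total = 4 it is 3.
    return reachable >= total // 2 + 1


def decide(reachable: int, total: int) -> str:
    """Hypothetical policy: keep running with quorum, otherwise release resources."""
    return "continue" if has_quorum(reachable, total) else "release"
```

With 5 configured nodes split 3/2, the 3-node side continues and the
2-node side releases. With 4 nodes split 2/2, neither side reaches the
threshold of 3, so both release; an even split needs some tie-breaker
beyond plain majority.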
Personally, I don't think this decision belongs with TIPC. TIPC
provides messaging and notification, but a layer "above" TIPC should be
the owner of the policy decisions. TIPC should just keep providing
service to the nodes that can be reached.
Note that I'm NOT saying that whatever this policy layer is doing to
deal with split-brain is simple, far from it, just that TIPC shouldn't
worry about it.
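The layering argued for here could look roughly like the Python sketch
below. Everything in it is an assumption made for illustration: the
Messaging class stands in for a TIPC-like layer that only tracks and
reports reachability, and QuorumPolicy is a hypothetical layer above it
that owns the decision. Neither reflects an actual TIPC interface.

```python
from typing import Callable, FrozenSet, Iterable, List


class Messaging:
    """Stand-in for the messaging layer: reports which nodes are
    reachable and makes no policy decisions of its own."""

    def __init__(self, configured: Iterable[str]) -> None:
        self.configured = frozenset(configured)
        self.reachable = set(self.configured)  # includes the local node
        self._listeners: List[Callable[[FrozenSet[str]], None]] = []

    def subscribe(self, callback: Callable[[FrozenSet[str]], None]) -> None:
        self._listeners.append(callback)

    def _notify(self) -> None:
        view = frozenset(self.reachable)
        for callback in self._listeners:
            callback(view)

    def link_lost(self, node: str) -> None:
        self.reachable.discard(node)
        self._notify()

    def link_restored(self, node: str) -> None:
        if node in self.configured:
            self.reachable.add(node)
            self._notify()


class QuorumPolicy:
    """Hypothetical layer above messaging that owns the split-brain policy."""

    def __init__(self, messaging: Messaging) -> None:
        self.total = len(messaging.configured)
        self.holding_resources = False
        messaging.subscribe(self.on_membership_change)
        self.on_membership_change(frozenset(messaging.reachable))

    def on_membership_change(self, reachable: FrozenSet[str]) -> None:
        # Majority quorum is only one possible policy; swapping it out
        # requires no change to the Messaging class.
        self.holding_resources = len(reachable) >= self.total // 2 + 1
```

The point of the split is that Messaging keeps serving whatever nodes it
can reach, while the choice between continuing and releasing resources
lives entirely in the policy object.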
> The way we have partitioned our systems is to always configure each site
> as a separate zone (cluster), because we had other constraints making
> this most practical. Inter-zone links can then be set up via UDP, but
> the "location transparency" stops at the cluster edge.
> > -Corey
Peter R. Badovinatz aka 'Wombat' -- IBM Linux Technology Center
preferred: tabmowzo at us.ibm.com / alternate: wombat at us.ibm.com
These are my opinions and absolutely not official opinions of IBM, Corp.