[Openais] [whitetank / corosync trunk] Fix checkpoint sync in certain scenarios

Steven Dake sdake at redhat.com
Fri Nov 7 01:24:38 PST 2008


In a certain rare scenario, the checkpoint service throws away the
current checkpoint database.

An example of when this occurs: there are 3 nodes A, B, and C. Nodes A
and C are killed, then node B syncs.  After this sync completes, node C
is started and node B again begins resyncing, but during this sync
process node A starts up.

This results in node B no longer believing it is required to sync its
current database contents.  The abort called on node B throws away all
checkpoints in the system, but since node B no longer has the lowest
node id in the system, it believes it doesn't have to sync.

The design change is that once a node has been declared responsible
for synchronization, no abort or configuration change will alter that
fact: the node remains responsible for synchronization.
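
To illustrate the idea, here is a rough sketch in C.  It is not the code
from the attached patch.  All names (sync_state, must_sync_old,
must_sync_sticky, replay_scenario) are made up for this example, the node
ids A=1, B=2, C=3 are assumed, and it does not model clearing the flag
once a sync finishes successfully.  It only contrasts recomputing "am I
the lowest node id?" on every configuration change with latching the
decision once it has been made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node state; not the names used in the real patch. */
struct sync_state {
    unsigned int my_id;
    bool responsible;   /* latched once this node is declared responsible */
};

/* Return the lowest node id in the current membership list. */
static unsigned int lowest_id(const unsigned int *members, size_t count)
{
    assert(count > 0);
    unsigned int low = members[0];
    for (size_t i = 1; i < count; i++) {
        if (members[i] < low)
            low = members[i];
    }
    return low;
}

/* Old behaviour: the decision is recomputed on every configuration change. */
static bool must_sync_old(unsigned int my_id,
                          const unsigned int *members, size_t count)
{
    return lowest_id(members, count) == my_id;
}

/* New behaviour: once responsible, later changes never clear the flag. */
static bool must_sync_sticky(struct sync_state *st,
                             const unsigned int *members, size_t count)
{
    if (lowest_id(members, count) == st->my_id)
        st->responsible = true;
    return st->responsible;
}

/*
 * Replay the scenario from node B's point of view and report whether
 * B still believes it must sync after node A rejoins mid-sync.
 */
static bool replay_scenario(bool sticky)
{
    enum { A = 1, B = 2, C = 3 };           /* assumed node ids */
    const unsigned int only_b[] = { B };     /* A and C killed */
    const unsigned int b_c[]    = { B, C };  /* C started, B resyncs */
    const unsigned int a_b_c[]  = { A, B, C }; /* A starts during the sync */
    struct sync_state st = { .my_id = B, .responsible = false };
    bool result;

    if (sticky) {
        must_sync_sticky(&st, only_b, 1);
        must_sync_sticky(&st, b_c, 2);
        result = must_sync_sticky(&st, a_b_c, 3);
    } else {
        must_sync_old(B, only_b, 1);
        must_sync_old(B, b_c, 2);
        result = must_sync_old(B, a_b_c, 3);
    }
    return result;
}
```

With the old rule, replay_scenario(false) reports that node B no longer
thinks it has to sync once A is in the membership, which is the case
where the checkpoints are lost.  With the latched flag,
replay_scenario(true) reports that B is still responsible.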

Regards
-steve
-------------- next part --------------
A non-text attachment was scrubbed...
Name: whitetank-ckpt-fix-sync.patch
Type: text/x-patch
Size: 2725 bytes
Desc: not available
Url : http://lists.linux-foundation.org/pipermail/openais/attachments/20081107/3d99f46c/attachment.bin 

