[Openais] [Pacemaker] Pacemaker on OpenAIS, RRP, and link failure

Lars Marowsky-Bree lmb at suse.de
Thu Jun 4 09:30:12 PDT 2009


On 2009-06-04T09:23:04, Steven Dake <sdake at redhat.com> wrote:

> The problem with checking the link status with the current code is that
> the protocol blocks I/O waiting for a response from the failed ring.
> This could of course be modified to behave differently.

Right, so the rechecking could possibly be done in a separate thread that
sends an occasional liveness packet on the failed ring and triggers RRP
recovery once it has heard from the other nodes on it?

Some smarts would be needed, of course, to avoid constantly retriggering
recovery on partially active rings (which would fail again immediately).
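
To make the idea concrete, here is a rough sketch of the decision logic such
a recheck thread might use. It is not OpenAIS code: the class name, method
names, and all parameters (needed_rounds, base_holddown, max_holddown,
flap_window) are made up for illustration, and the actual sending of probes
and the threading are left out. The ring is re-enabled only after every
expected node has answered several probe rounds in a row, and the hold-down
doubles if the ring fails again shortly after a recovery.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RingRecheck:
    """Hypothetical policy deciding when a failed ring may be re-enabled."""
    peers: frozenset              # node ids expected to answer on this ring
    needed_rounds: int = 3        # consecutive fully answered probe rounds required
    base_holddown: float = 5.0    # seconds after a failure during which probes are ignored
    max_holddown: float = 300.0   # upper bound for the backed-off hold-down
    flap_window: float = 30.0     # a failure this soon after recovery counts as flapping
    holddown: float = field(default=0.0, init=False)
    good_rounds: int = field(default=0, init=False)
    failed_at: float = field(default=0.0, init=False)
    recovered_at: Optional[float] = field(default=None, init=False)

    def __post_init__(self):
        self.holddown = self.base_holddown

    def ring_failed(self, now: float) -> None:
        """Record a ring failure; back off if it follows a recent recovery."""
        flapping = (self.recovered_at is not None
                    and now - self.recovered_at < self.flap_window)
        if flapping:
            self.holddown = min(self.holddown * 2, self.max_holddown)
        else:
            self.holddown = self.base_holddown
        self.failed_at = now
        self.good_rounds = 0

    def probe_round(self, now: float, responders) -> bool:
        """Feed in one probe round; return True if recovery should be triggered."""
        if now - self.failed_at < self.holddown:
            return False                      # still in hold-down, do not count
        if self.peers <= set(responders):
            self.good_rounds += 1             # every expected node answered
        else:
            self.good_rounds = 0              # partially active ring: start over
        if self.good_rounds >= self.needed_rounds:
            self.recovered_at = now
            self.good_rounds = 0
            return True
        return False
```

The recheck thread would call ring_failed() when RRP marks the ring faulty
and probe_round() after each liveness round, so the healthy ring's I/O path
never has to wait on the failed one.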

> So the act of failing a link is expensive and we don't want to retest
> that it is valid very often.

Does "expensive" mean that it'll actually slow down the healthy
ring(s)?


Regards,
    Lars

-- 
SuSE Labs, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde


