[Openais] [corosync trunk] fix process pausing issue with membership algorithm

Chrissie Caulfield ccaulfie at redhat.com
Fri Jun 26 01:30:11 PDT 2009


Steven Dake wrote:
> When a process pauses for longer than the token timeout, the other
> processors in the system form a new ring.  The remaining processor then
> eventually reschedules and processes the pending membership multicast
> messages in its kernel queues.  This wreaks havoc on the membership of
> the other nodes.
> 
> While a proper kernel shouldn't pause for long periods, it's a reality
> that many kernels still have long periods of spinlocking without
> scheduling and no proper preemption.
> 
> This patch resolves the scenario by creating a timer which records a
> time stamp at an interval that is the token timeout / 5.  Then if a
> process executes the membership algorithm by receiving a join message,
> the current time is retrieved and compared to the timestamp.  If they
> differ by more than token timeout / 2, it is assumed the process
> couldn't schedule (because it couldn't trigger the timer callbacks via
> poll) and calls totemnet to flush any pending multicasts in the file
> descriptor responsible for receiving multicast messages.  This results
> in the old membership messages being thrown away allowing the new
> membership to form properly.
> 
> This can be tested by pressing ctrl-z on a corosync process in an 8 node cluster.
> Then use fg to bring it into the foreground.  Pre-patch - bad news -
> post patch, prints a notice and proceeds properly.
> 
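
For reference, the check described above can be sketched roughly as follows.
This is only an illustration of the description, not the code in the patch:
the struct, the function names and the millisecond units are all assumptions.

```c
#include <stdint.h>

/* Hypothetical state; the real patch keeps this inside the totemsrp instance. */
struct pause_state {
	uint64_t token_timeout_ms;   /* configured token timeout */
	uint64_t pause_timestamp_ms; /* last time the periodic timer ran */
};

/*
 * Timer callback, assumed to be scheduled every token_timeout / 5.
 * It only runs if the process is actually being scheduled.
 */
static void pause_timer_fired(struct pause_state *st, uint64_t now_ms)
{
	st->pause_timestamp_ms = now_ms;
}

/*
 * Called when a join message arrives.  Returns 1 if the timestamp is
 * stale by more than token_timeout / 2, which suggests the process was
 * paused and pending multicasts should be flushed; 0 otherwise.
 */
static int pause_detected(const struct pause_state *st, uint64_t now_ms)
{
	return (now_ms - st->pause_timestamp_ms) > (st->token_timeout_ms / 2);
}
```

With a 1000 ms token timeout the timer would run every 200 ms, so a gap of
more than 500 ms means at least two timer callbacks were missed.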

At the bottom of the patch:

+       if (pause_flush (instance)) {
+               return (0);
+       }

will skip the rest of the routine if pause_flush encounters an error, as
well as if it flushes some messages ... is that intended behaviour ?

It's a consequence of overloading the return code to indicate not only
whether the operation succeeded or not, but also whether it flushed any
messages.  Perhaps there should be a pass-by-reference parameter for
&messages_flushed to keep them separate ?
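
Something along these lines is what I have in mind.  It is only a sketch:
the stub struct, the _sketch function names and the return values are made
up for illustration and are not what the patch or totemsrp actually use.

```c
/* Hypothetical stand-in for the real instance type. */
struct instance_stub {
	int flush_fails; /* non-zero simulates an error from the flush */
	int pending;     /* number of stale messages waiting to be discarded */
};

/*
 * Return code reports success (0) or failure (-1) only.
 * The number of discarded messages is reported via *messages_flushed.
 */
static int pause_flush_sketch(struct instance_stub *inst, int *messages_flushed)
{
	*messages_flushed = 0;
	if (inst->flush_fails) {
		return -1;
	}
	*messages_flushed = inst->pending;
	inst->pending = 0;
	return 0;
}

/*
 * Caller: the error case and the "messages were flushed" case can now be
 * handled separately.  Returns -1 on error, 0 if the rest of the routine
 * was skipped because of a flush, 1 if normal processing ran.
 */
static int handle_join_sketch(struct instance_stub *inst)
{
	int flushed = 0;

	if (pause_flush_sketch(inst, &flushed) != 0) {
		return -1; /* what to do on error is a separate decision */
	}
	if (flushed > 0) {
		return 0;  /* stale messages discarded, skip the rest */
	}
	/* ... normal join processing would go here ... */
	return 1;
}
```

Whether an error should also skip the rest of the routine is then an
explicit choice in the caller rather than a side effect of the return code.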


Chrissie

