[Openais] whitetank cluster not reforming after 'if down'

Steven Dake sdake at redhat.com
Thu Jul 30 09:07:40 PDT 2009


Is Xen bridging enabled?  Does OpenAIS start before Xen bridging starts?
This has been a problem...

Regards
-steve
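
If the start-order question above turns out to be the cause, one way to test
it is to hold off starting aisexec until the bridge interface is reported up.
The following is only a minimal sketch under stated assumptions: the interface
name "xenbr0" and the helper names are hypothetical and not taken from this
thread, and it relies on the Linux sysfs file
/sys/class/net/<iface>/operstate rather than on anything OpenAIS provides.

```python
import os
import time


def interface_is_up(iface, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports `iface` as operationally up.

    Hypothetical helper: reads <sysfs_root>/<iface>/operstate.
    """
    path = os.path.join(sysfs_root, iface, "operstate")
    try:
        with open(path) as f:
            state = f.read().strip()
    except OSError:
        # The interface does not exist (yet), e.g. the bridge is not created.
        return False
    # Some drivers report "unknown" instead of "up" even when usable.
    return state in ("up", "unknown")


def wait_for_interface(iface, timeout=60.0, poll=1.0,
                       sysfs_root="/sys/class/net"):
    """Poll until `iface` is up; return False if `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if interface_is_up(iface, sysfs_root):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

An init script could call wait_for_interface("xenbr0") (again, an assumed
name) and start aisexec only when it returns True. Note that this checks link
state only, not whether multicast traffic actually passes over the bridge.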

On Thu, 2009-07-30 at 14:54 +0200, Andrew Beekhof wrote:
> Steve, I've been able to reproduce this reliably _without_ Pacemaker  
> being involved.
> 
> Attached are the two openais log files.
> 
> Scenario:
> 
> t0: hikari and hikari2 are up and can see each other
> t1: Powercycle hikari2
> t2: hikari2 comes up
> t3: ping confirms that hikari2 can contact hikari
>      I modified the openais init script to run: ping -c 10 hikari > afile
> t4: hikari2 starts openais
> t5: hikari starts producing membership events every 3s but does not
>      form a membership with hikari2
> t6: hikari2 forms a membership by itself
> t7: (about 1 or 2 minutes after t6, it varies) hikari and hikari2 form
>      a combined membership
> 
> The strangest part of this is that hikari2 must reboot in order to
> trigger this behavior.
> Stopping or killing aisexec and then starting it again is not
> sufficient.
> 
> Do you want to continue the discussion here or move to bugzilla?
> 
> 
> On Jul 21, 2009, at 1:49 PM, Lars Marowsky-Bree wrote:
> 
> > On 2009-06-30T12:27:33, Andrew Beekhof <andrew at beekhof.net> wrote:
> >
> >> I'm working with a cluster that's having trouble reforming.
> >> Before I explain, here is the totem section (which is the same on
> >> both nodes, except for the nodeid).
> >
> > Hi all, Steven,
> >
> > this problem persists. After a reboot, we sometimes see memberships not
> > reforming - for example, A B C D E, C & D reboot, we end up with A-B-E
> > and C-D or C / D by themselves or some other really weird membership.
> >
> > The problem persists with the latest whitetank. Occasionally one of the
> > dlm_controld processes seems to be hogging IPC (which seems to affect
> > the rest of the system quite a bit), but this isn't always the case.
> >
> > It is not always reproducible, and the symptoms are, well, weird.
> >
> > Has anyone else ever seen this?
> >
> >
> >
> > Regards,
> >    Lars
> >
> > -- 
> > Architect Storage/HA, OPS Engineering, Novell, Inc.
> > SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
> > "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
> >
> 
> -- Andrew
> 
> 
> 


