[Openais] Re: defect 169 fixed up (and 172)

Steven Dake sdake at mvista.com
Fri Oct 29 11:26:55 PDT 2004


On Fri, 2004-10-29 at 10:15, Mark Haverkamp wrote:
> On Fri, 2004-10-29 at 10:05 -0700, Mark Haverkamp wrote:
> 
> > 
> > I'm guessing that the mcast isn't happening from the send side.  I'll
> > add a results check to each of the sendmsg calls in gmi.c and see where
> > things are going wrong.
> > 
> > Mark.
> > 
> 
> OK, here are the results of printing out res:
> 
> 
> 
> 
> Oct 29 10:08:13 [WARNING ] [GMI  ] Token being retransmitted.
> sendmsg failed errno == 22
> Oct 29 10:08:13 [WARNING ] [GMI  ] The network interface is down.
> Oct 29 10:08:13 [WARNING ] [GMI  ] Token loss in OPERATIONAL.
> Oct 29 10:08:13 [NOTICE  ] [GMI  ] entering GATHER state.
> Oct 29 10:08:13 [NOTICE  ] [GMI  ] SENDING attempt join because this node is ring rep.
> memb_state_gather_enter: res = -1 errno = 22
> mjsend: res = -1, errno = 22
> Oct 29 10:08:14 [NOTICE  ] [GMI  ] I am the only member.
> Oct 29 10:08:14 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
> Oct 29 10:08:14 [NOTICE  ] [CLM  ] New Configuration:
> Oct 29 10:08:14 [NOTICE  ] [CLM  ]      192.168.1.18
> Oct 29 10:08:14 [NOTICE  ] [CLM  ] Members Left:
> Oct 29 10:08:14 [NOTICE  ] [CLM  ]      192.168.1.8
> Oct 29 10:08:14 [NOTICE  ] [CLM  ]      192.168.1.17
> Oct 29 10:08:14 [NOTICE  ] [CLM  ]      192.168.1.19
> Oct 29 10:08:14 [NOTICE  ] [CLM  ] Members Joined:
> Oct 29 10:08:14 [NOTICE  ] [EVT  ] cluster node at 192.168.1.8 down
> Oct 29 10:08:14 [NOTICE  ] [EVT  ] cluster node at 192.168.1.17 down
> Oct 29 10:08:14 [NOTICE  ] [EVT  ] cluster node at 192.168.1.19 down
> Oct 29 10:08:14 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
> Oct 29 10:08:14 [NOTICE  ] [CLM  ] New Configuration:
> Oct 29 10:08:14 [NOTICE  ] [CLM  ]      192.168.1.18
> Oct 29 10:08:14 [NOTICE  ] [CLM  ] Members Left:
> Oct 29 10:08:14 [NOTICE  ] [CLM  ] Members Joined:
> Oct 29 10:08:14 [NOTICE  ] [EVT  ] No channels to send
> otmcast: res = -1, errno = 22
> 
> 
> 
> 
> Oct 29 10:08:26 [WARNING ] [GMI  ] The network interface is now up.
> Oct 29 10:08:26 [NOTICE  ] [GMI  ] entering GATHER state.
> Oct 29 10:08:26 [NOTICE  ] [GMI  ] SENDING attempt join because this node is ring rep.
> memb_state_gather_enter: res = 44 errno = 22
> Oct 29 10:08:26 [NOTICE  ] [GMI  ] I am the only member.
> Oct 29 10:08:26 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
> Oct 29 10:08:26 [NOTICE  ] [CLM  ] New Configuration:
> Oct 29 10:08:26 [NOTICE  ] [CLM  ]      192.168.1.18
> Oct 29 10:08:26 [NOTICE  ] [CLM  ] Members Left:
> Oct 29 10:08:26 [NOTICE  ] [CLM  ] Members Joined:
> Oct 29 10:08:26 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
> Oct 29 10:08:26 [NOTICE  ] [CLM  ] New Configuration:
> Oct 29 10:08:26 [NOTICE  ] [CLM  ]      192.168.1.18
> Oct 29 10:08:26 [NOTICE  ] [CLM  ] Members Left:
> Oct 29 10:08:26 [NOTICE  ] [CLM  ] Members Joined:
> Oct 29 10:08:26 [NOTICE  ] [EVT  ] Already in config change, Starting over, m 1, c 0
> Oct 29 10:08:26 [NOTICE  ] [EVT  ] No channels to send
> otmcast: res = 364, errno = 11
> 
> 
> It looks like something was sent successfully, but we're not receiving
> it. I'm not sure how the multicasting works, but does the application
> need to register for receiving mcasts?  If so, could we have lost the
> registration when the interface went down?
> 
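
A side note on reading the output above: on Linux, errno 22 is EINVAL and
errno 11 is EAGAIN, and errno is only meaningful when the call returned -1.
The lines showing res = 44 and res = 364 are successful sends, so the errno
printed beside them is most likely left over from an earlier call.  Below is
a minimal sketch of a results check that reports errno only on failure.
The name checked_sendmsg and its tag argument are hypothetical, made up for
illustration; this is not code from gmi.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not a function from gmi.c: wraps sendmsg() and
 * prints errno only when the call actually failed.
 * Returns the byte count on success, or -1 with errno preserved.
 */
static ssize_t checked_sendmsg(const char *tag, int fd,
                               const struct msghdr *msg, int flags)
{
    ssize_t res;
    int saved_errno;

    assert(tag != NULL && msg != NULL);

    errno = 0;                      /* discard any stale value */
    res = sendmsg(fd, msg, flags);
    saved_errno = errno;            /* fprintf() below may change errno */

    if (res == -1) {
        fprintf(stderr, "%s: sendmsg failed: res = -1, errno = %d (%s)\n",
                tag, saved_errno, strerror(saved_errno));
    }

    errno = saved_errno;            /* let the caller inspect it too */
    return res;
}
```

Passing the call-site name as the tag (for example "mjsend" or "otmcast")
would keep the output comparable with the lines in the log above.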

The multicast code does a variety of things which could fail if the
interface goes down.  This is a behavior change from 2.4, which doesn't
seem to suffer any negative effects when the interface goes down and then
comes back up.  One thing to note: in my testing I used "ifconfig eth1
down", waited 5 seconds, then "ifconfig eth1 up", not ifdown and ifup.
Would you try ifconfig to see if it behaves any differently?

The first thing that is done is that a socket is bound to the multicast
address.  The local address is bound to receive the token.  Also, two
sockets are bound to the interface address specified in bindnet.

I am going to work up a patch to redo the creation and binding of these
sockets (it's mostly abstracted, so it shouldn't be too difficult) and try
it out on 2.6.  I'll let you know what I find.

Thanks for your help, Mark.
-steve

> Mark.
> 
> 



