[Openais] Re: defect 169 fixed up (and 172)

Mark Haverkamp markh at osdl.org
Thu Oct 28 10:40:10 PDT 2004


On Thu, 2004-10-28 at 07:24 -0700, Mark Haverkamp wrote:
> On Wed, 2004-10-27 at 17:37 -0700, Steven Dake wrote:
> > Did openais print out
> > "the interface is now up" ?  If so, then there is some problem with the
> > membership algo. 
> 
> I did not see that.

Now that I think about it, I didn't see the "interface is down" message
either.

I interrupted the aisexec program when it appeared to be stuck and did a
stack trace.  It was waiting in poll and the timeout was -1.  I see that
the sendmsg in timer_function_token_retransmit_timeout checks for
ENETUNREACH, but I'm seeing EINVAL (errno 22).  As an experiment, I
checked for any -1 return from sendmsg and then called netif_down_check.
With that change I see:

Oct 28 10:13:31 [WARNING ] [GMI  ] Token being retransmitted.
sendmsg failed errno == 22
Oct 28 10:13:31 [WARNING ] [GMI  ] The network interface is down.
Oct 28 10:13:31 [WARNING ] [GMI  ] The network interface is now up.
Oct 28 10:13:31 [NOTICE  ] [GMI  ] entering GATHER state.
Oct 28 10:13:31 [NOTICE  ] [GMI  ] No members sent join, keeping old ring and transitioning to operational.
Oct 28 10:13:32 [WARNING ] [GMI  ] Token loss in OPERATIONAL.
Oct 28 10:13:32 [NOTICE  ] [GMI  ] entering GATHER state.
Oct 28 10:13:32 [NOTICE  ] [GMI  ] SENDING attempt join because this node is ring rep.
Oct 28 10:13:32 [NOTICE  ] [GMI  ] I am the only member.
Oct 28 10:13:32 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
Oct 28 10:13:32 [NOTICE  ] [CLM  ] New Configuration:
Oct 28 10:13:32 [NOTICE  ] [CLM  ]      192.168.1.18
Oct 28 10:13:32 [NOTICE  ] [CLM  ] Members Left:
Oct 28 10:13:32 [NOTICE  ] [CLM  ]      192.168.1.8
Oct 28 10:13:32 [NOTICE  ] [CLM  ]      192.168.1.17
Oct 28 10:13:32 [NOTICE  ] [CLM  ]      192.168.1.19
Oct 28 10:13:32 [NOTICE  ] [CLM  ] Members Joined:
Oct 28 10:13:32 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
Oct 28 10:13:32 [NOTICE  ] [CLM  ] New Configuration:
Oct 28 10:13:32 [NOTICE  ] [CLM  ]      192.168.1.18
Oct 28 10:13:32 [NOTICE  ] [CLM  ] Members Left:
Oct 28 10:13:32 [NOTICE  ] [CLM  ] Members Joined:
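
For reference, the experiment described above amounts to roughly the
following.  This is only a sketch, not the actual openais code: the
helper name, the enum, and the policy argument are made up here to
contrast the existing ENETUNREACH-only check with treating any -1
return from sendmsg as a reason to call netif_down_check.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical policy selector, not part of openais. */
enum send_check_policy {
    CHECK_ON_ENETUNREACH_ONLY, /* existing behavior as described above */
    CHECK_ON_ANY_FAILURE       /* the experiment: any -1 return counts */
};

/*
 * Hypothetical helper: given the return value of sendmsg() and the errno
 * captured right after the call, report whether the caller should go on
 * to run its interface-down check (netif_down_check in the real code).
 */
int send_failure_needs_netif_check(ssize_t res, int err,
                                   enum send_check_policy policy)
{
    if (res != -1) {
        return 0;               /* send succeeded, nothing to check */
    }
    if (policy == CHECK_ON_ANY_FAILURE) {
        return 1;               /* EINVAL, ENETUNREACH, anything else */
    }
    return err == ENETUNREACH;  /* misses the EINVAL case seen here */
}
```

With the ENETUNREACH-only policy, the EINVAL failure seen here never
triggers the interface check, which would be consistent with neither the
"down" nor the "up" message appearing before the change.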


I checked the return value from netif_determine as well as interface_up.
interface_up could be indeterminate because it isn't initialized.  I
initialized it in netif_determine to zero in case it isn't set
elsewhere.
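
The initialization change is along these lines.  Again, this is a sketch
under assumptions: the struct, the function name, and the signature are
invented for illustration and do not match the real netif_determine.
The point is only that the out-parameter gets a defined value on every
path, including the one where no matching interface is found.

```c
#include <stddef.h>

/* Hypothetical, simplified view of one interface entry. */
struct iface_sketch {
    int matches_bind_addr;  /* nonzero if this is the interface we want */
    int is_up;              /* nonzero if the interface is flagged up */
};

/*
 * Hypothetical stand-in for netif_determine.  Returns 0 if a matching
 * interface was found, -1 otherwise.  *interface_up is set to zero first
 * so the caller never reads an indeterminate value.
 */
int netif_determine_sketch(const struct iface_sketch *ifaces, size_t count,
                           int *interface_up)
{
    size_t i;

    *interface_up = 0;      /* defined default: treat as down */

    for (i = 0; i < count; i++) {
        if (ifaces[i].matches_bind_addr) {
            *interface_up = ifaces[i].is_up ? 1 : 0;
            return 0;
        }
    }
    return -1;              /* no match; *interface_up stays 0 */
}
```

The caller should still look at the return value, since 0 in
interface_up can mean either "found but down" or "not found".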

This worked somewhat better.  It does notice that the interface
goes down and comes back up.  But the config change gets stuck on all
the nodes.

I'll continue to try to narrow this down.

Mark.


> 
> > 
> > If not, then there is some problem with detection of if the interface is
> > up.  
> > 
> > Does ifconfig show the interface as up after ifup?
> 
> I didn't check, but it must be up since if I restart aisexec everything
> is OK.
> 
> Mark.
-- 
Mark Haverkamp <markh at osdl.org>



