[Openais] Re: defect 169 fixed up (and 172)
Mark Haverkamp
markh at osdl.org
Thu Oct 28 10:40:10 PDT 2004
On Thu, 2004-10-28 at 07:24 -0700, Mark Haverkamp wrote:
> On Wed, 2004-10-27 at 17:37 -0700, Steven Dake wrote:
> > Did openais print out
> > "the interface is now up" ? If so, then there is some problem with the
> > membership algo.
>
> I did not see that.
Now that I think about it, I didn't see the "interface is down" message
either.
I interrupted the aisexec program when it appeared to be stuck and took a
stack trace. It was waiting in poll with a timeout of -1. I see that
the sendmsg in timer_function_token_retransmit_timeout checks for
ENETUNREACH, but I'm seeing EINVAL. As an experiment, I checked only
for a -1 return from sendmsg and then called netif_down_check. I see:
Oct 28 10:13:31 [WARNING ] [GMI ] Token being retransmitted.
sendmsg failed errno == 22
Oct 28 10:13:31 [WARNING ] [GMI ] The network interface is down.
Oct 28 10:13:31 [WARNING ] [GMI ] The network interface is now up.
Oct 28 10:13:31 [NOTICE ] [GMI ] entering GATHER state.
Oct 28 10:13:31 [NOTICE ] [GMI ] No members sent join, keeping old ring and transitioning to operational.
Oct 28 10:13:32 [WARNING ] [GMI ] Token loss in OPERATIONAL.
Oct 28 10:13:32 [NOTICE ] [GMI ] entering GATHER state.
Oct 28 10:13:32 [NOTICE ] [GMI ] SENDING attempt join because this node is ring rep.
Oct 28 10:13:32 [NOTICE ] [GMI ] I am the only member.
Oct 28 10:13:32 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Oct 28 10:13:32 [NOTICE ] [CLM ] New Configuration:
Oct 28 10:13:32 [NOTICE ] [CLM ] 192.168.1.18
Oct 28 10:13:32 [NOTICE ] [CLM ] Members Left:
Oct 28 10:13:32 [NOTICE ] [CLM ] 192.168.1.8
Oct 28 10:13:32 [NOTICE ] [CLM ] 192.168.1.17
Oct 28 10:13:32 [NOTICE ] [CLM ] 192.168.1.19
Oct 28 10:13:32 [NOTICE ] [CLM ] Members Joined:
Oct 28 10:13:32 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Oct 28 10:13:32 [NOTICE ] [CLM ] New Configuration:
Oct 28 10:13:32 [NOTICE ] [CLM ] 192.168.1.18
Oct 28 10:13:32 [NOTICE ] [CLM ] Members Left:
Oct 28 10:13:32 [NOTICE ] [CLM ] Members Joined:
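For reference, the experiment amounts to roughly the following. This is
only a sketch, not the aisexec source: netif_down_check_stub,
strict_check, relaxed_check and handle_send_result are hypothetical
names made up for illustration, and the real check lives in
timer_function_token_retransmit_timeout. errno 22 in the log above is
EINVAL on Linux.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for netif_down_check(); it only counts calls. */
static int down_checks = 0;
static void netif_down_check_stub(void) { down_checks++; }

/* Behaviour as described above: only ENETUNREACH triggers the check. */
static int strict_check(ssize_t res, int err)
{
    return res == -1 && err == ENETUNREACH;
}

/* Experimental behaviour: any failed sendmsg triggers the check. */
static int relaxed_check(ssize_t res, int err)
{
    (void)err; /* errno is reported by the caller but not used to decide */
    return res == -1;
}

/* res is the sendmsg return value, err is errno saved right after it. */
static void handle_send_result(ssize_t res, int err)
{
    if (relaxed_check(res, err)) {
        fprintf(stderr, "sendmsg failed errno == %d (%s)\n",
                err, strerror(err));
        netif_down_check_stub();
    }
}
```

With the strict test, an EINVAL failure never reaches the interface
check, which would match not seeing either interface message.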
I checked the return value from netif_determine as well as interface_up.
interface_up could be indeterminate because it isn't initialized. I
initialized it in netif_determine to zero in case it isn't set
elsewhere.
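The initialization change is along these lines. Again, this is a sketch
under assumptions: netif_determine_sketch, its parameters and its return
convention are hypothetical and do not reflect the real netif_determine
signature. The point is only that the out-parameter is set before any
path that can return without touching it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified lookup: up_ifaces lists the interfaces that
 * are currently up; wanted is the interface we are bound to.
 * Returns 0 if found, -1 otherwise (assumed convention).
 */
static int netif_determine_sketch(const char *const *up_ifaces, size_t n,
                                  const char *wanted, int *interface_up)
{
    size_t i;

    *interface_up = 0; /* defined value even if nothing matches */

    for (i = 0; i < n; i++) {
        if (strcmp(up_ifaces[i], wanted) == 0) {
            *interface_up = 1;
            return 0;
        }
    }
    return -1; /* not found: caller still sees interface_up == 0 */
}
```

Without the assignment at the top, a caller that reads interface_up
after a failed lookup would see whatever happened to be in that
variable.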
This worked somewhat better. It does notice that the interface
comes up, goes down, and comes back. But the config change gets stuck on
all the nodes.
I'll continue to try to narrow this down.
Mark.
>
> >
> > If not, then there is some problem with detection of if the interface is
> > up.
> >
> > Does ifconfig show the interface as up after ifup?
>
> I didn't check, but it must be up since if I restart aisexec everything
> is OK.
>
> Mark.
>
> _______________________________________________
> Openais mailing list
> Openais at lists.osdl.org
> http://lists.osdl.org/mailman/listinfo/openais
--
Mark Haverkamp <markh at osdl.org>