[Openais] [Corosync] Corosync does not retransmit the lost mcast message

Steven Dake sdake at redhat.com
Thu Mar 18 15:33:02 PDT 2010


On Thu, 2010-03-18 at 10:55 -0600, hj lee wrote:
> Hi,
> 
> I had an instance where one of the mcast messages was lost. But corosync
> does not try to retransmit the lost message, so the other node gets
> into "FAILED TO RECEIVE". The logs from both servers are below.
> srv3 did not receive mcast message 161. The problem is that srv3
> did not request retransmission of that lost message.
> 

My analysis of the log data indicates that srv3's IP address is
192.168.10.21. Is that correct?

A full log (please attach it) would be helpful to see the events that led
up to the problem.  I especially want to know whether totem was in the
operational state or some other state when this happened.
> 
> 2010-03-17 15:22:03.831704 srv3-corosync[6213]:  [TOTEM ]
> totemsrp.c:2094 mcasted message added to pending queue
> 2010-03-17 15:22:03.831719 srv3-corosync[6213]:  [TOTEM ]
> totemsrp.c:3580 Delivering 160 to 162
> 2010-03-17 15:22:03.831727 srv3-corosync[6213]:  [TOTEM ]
> totemsrp.c:3747 Received ringid(192.168.10.21:1532) seq 162
> 2010-03-17 15:22:03.831734 srv3-corosync[6213]:  [TOTEM ]
> totemsrp.c:3580 Delivering 160 to 162
> 
> 2010-03-17 15:22:03.204757 srv4-corosync[22981]:  [TOTEM ]
> totemsrp.c:3747 Received ringid(192.168.10.21:1532) seq 160
> 2010-03-17 15:22:03.204765 srv4-corosync[22981]:  [TOTEM ]
> totemsrp.c:3747 Received ringid(192.168.10.21:1532) seq 161
> 2010-03-17 15:22:03.205069 srv4-corosync[22981]:  [TOTEM ]
> totemsrp.c:2217 releasing messages up to and including 160
> 2010-03-17 15:22:03.828871 srv4-corosync[22981]:  [TOTEM ]
> totemsrp.c:3747 Received ringid(192.168.10.21:1532) seq 162
> 2010-03-17 15:22:03.828884 srv4-corosync[22981]:  [TOTEM ]
> totemsrp.c:3580 Delivering 161 to 162
> 2010-03-17 15:22:03.828892 srv4-corosync[22981]:  [TOTEM ]
> totemsrp.c:3650 Delivering MCAST message with seq 162 to pending
> delivery queue
> 2010-03-17 15:22:03.859675 srv4-corosync[22981]:  [TOTEM ]
> totemsrp.c:3442 FAILED TO RECEIVE
> 2010-03-17 15:22:03.859689 srv4-corosync[22981]:  [TOTEM ]
> totemsrp.c:1102 Set consensus for 22/192.168.10.22 at 0 found 0
> 2010-03-17 15:22:03.859696 srv4-corosync[22981]:  [TOTEM ]
> totemsrp.c:1795 entering GATHER state from 6.
> 
> -- 
> Peakpoint Service
> 
> Cluster Setup, Troubleshooting & Development
> kerdosa at gmail.com
> (303) 997-2823


