[Bridge] how to handle bonding failover when using a bridge over the bond?
chris.friesen at genband.com
Wed Feb 13 17:14:00 UTC 2013
On 02/12/2013 06:30 PM, Chris Friesen wrote:
> On 02/12/2013 06:02 PM, Jay Vosburgh wrote:
>> Chris Friesen<chris.friesen at genband.com> wrote:
>>> I have a physical host with two ethernet links that are bonded
>>> together (active/backup). Each link is connected to a separate L2
>>> switch, which are in turn connected with a crosslink for
>>> The physical host is running multiple virtual machines each with
>>> a virtual adapter. The virtual adapters and the bond are all
>>> bridged together to allow communication between the virtual
>>> machines, the host, and the outside world.
>>> Now suppose one of the slave links fails. The bond device will
>>> failover to the other slave and send out a gratuitous arp on the
>>> newly active slave. This will cause the L2 switches to update
>>> their lookup tables for the MAC address associated with the bond
>>> (so it now points to the newly active slave), but doesn't update
>>> the MAC addresses associated with the various virtual machines.
>>> If someone on the network sends a packet to one of the virtual
>>> machines, the switch will try to send it over the failed slave.
>> If the link failure is such that there is no carrier on the switch
>> port, the switch will drop the forwarding entry for the virtual
>> machine's MAC address from that port. The traffic for the VM's MAC
>> would then flood to all ports, presumably including the link to
>> the other switch, which wouldn't have a forwarding entry for the
>> MAC, either (or it would be the switch link port), and would also
>> flood it to all ports, one of which is the correct one.
I talked with our networking guy. Apparently what is happening is that
if we pull the link to switch A, switch A drops the forwarding entries
for all MACs on the downed link, but switch B still has stale entries
pointing to the inter-switch link.
If a packet destined for the VM arrives at switch B, B will send it
across to switch A. (Which is pointless since A no longer has a
working link to the MAC in question.)
If a packet destined for the VM arrives at switch A, A will flood it
to all ports, including the inter-switch link to switch B. However,
switch B still thinks the MAC address is reachable via switch A, and
since a switch won't forward a frame back out the port it arrived on,
B drops the packet.
Once the VMs send out packets, switch B will update its tables, but if
the VMs are event-driven and mostly only respond to incoming packets,
they could end up waiting a long time.
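
To make the sequence above concrete, here is a rough toy model of two
MAC-learning switches. It is only a sketch under simplifying
assumptions: the Switch class, the port names ("host", "isl",
"client"), the endpoint names (slave0, slave1, client_a, client_b) and
the MAC strings are all made up for illustration, and real switches
also age entries out, run spanning tree, and so on.

```python
BCAST = "ff:ff:ff:ff:ff:ff"
# Made-up identifiers standing in for real MAC addresses.
VM, CLIENT_A, CLIENT_B = "vm-mac", "client-a-mac", "client-b-mac"


class Switch:
    """Toy MAC-learning switch (hypothetical model, not a real API)."""

    def __init__(self, name):
        self.name = name
        self.links = {}  # port name -> (Switch, peer port) or endpoint name
        self.up = set()  # ports that currently have carrier
        self.fdb = {}    # learned MAC -> port

    def connect(self, port, peer):
        self.links[port] = peer
        self.up.add(port)

    def link_down(self, port):
        # Carrier loss: flush every entry learned on that port.
        self.up.discard(port)
        self.fdb = {m: p for m, p in self.fdb.items() if p != port}

    def receive(self, src, dst, in_port, delivered):
        self.fdb[src] = in_port  # learn the source MAC
        out = self.fdb.get(dst)
        if out == in_port:
            return  # never forward back out the ingress port
        # Known destination: one port. Unknown: flood all but ingress.
        targets = [out] if out else [p for p in self.links if p != in_port]
        for port in targets:
            if port not in self.up:
                continue
            peer = self.links[port]
            if isinstance(peer, tuple):
                peer[0].receive(src, dst, peer[1], delivered)
            else:
                delivered.append(peer)


def build():
    a, b = Switch("A"), Switch("B")
    a.connect("host", "slave0")   # bond slave attached to switch A
    b.connect("host", "slave1")   # bond slave attached to switch B
    a.connect("isl", (b, "isl"))  # inter-switch crosslink
    b.connect("isl", (a, "isl"))
    a.connect("client", "client_a")
    b.connect("client", "client_b")
    return a, b


def send(switch, src, dst, port):
    delivered = []
    switch.receive(src, dst, port, delivered)
    return delivered


def scenario():
    a, b = build()
    send(a, VM, BCAST, "host")  # VM talks via slave0; A and B learn it
    a.link_down("host")         # pull the link to A; bond uses slave1
    results = {
        "via_b_stale": send(b, CLIENT_B, VM, "client"),
        "via_a_stale": send(a, CLIENT_A, VM, "client"),
    }
    send(b, VM, BCAST, "host")  # VM finally transmits via slave1
    results["via_b_fresh"] = send(b, CLIENT_B, VM, "client")
    results["via_a_fresh"] = send(a, CLIENT_A, VM, "client")
    return results


if __name__ == "__main__":
    print(scenario())
```

In this model neither "stale" frame reaches slave1: the one entering
at B is sent across to A and flooded there without reaching the host,
and the one entering at A is flooded to B and dropped. Only after the
VM itself transmits through slave1 do both switches relearn its MAC
and start delivering traffic again.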