[Bridge] [PATCH net-next v2] bridge: vlan: allow to suppress local mac install for all vlans

Nikolay Aleksandrov nikolay at cumulusnetworks.com
Sun Sep 13 13:22:51 UTC 2015

On 08/29/2015 03:11 AM, Vlad Yasevich wrote:
> On 08/28/2015 11:26 AM, Nikolay Aleksandrov wrote:
>>> On Aug 28, 2015, at 5:31 AM, Vlad Yasevich <vyasevic at redhat.com> wrote:
>>> On 08/27/2015 10:17 PM, Nikolay Aleksandrov wrote:
>>> I don't remember learning being all that complicated.  The hash only changed under
>>> rtnl when vlans were added/removed.  The nice this is that we wouldn't need
>>> to rebalance, because if the vlan is removed all fdb links get removed too.  They
>>> don't move to another bucket (But that was with static hash.  Need to look at rhash in
>>> more detail).
>>> If you want, I might still have patches hanging around on my machine that had a hash
>>> table implementation.  I can send them to you.
>>> -vlad
>> :-) Okay, I’m putting the crystal ball away. If you could send me these patches it’d be great so
>> I don’t have to start this from scratch.
> So, I forgot that I lost an old disk that had all that code, so I am a bit bummed about
> that.  I did however find the series that got posted.
> http://www.spinics.net/lists/netdev/msg219737.html
> That was the series where I briefly switch from bitmaps to hash and list.
> However, I see that the fdb code that was playing with never got posted...
> Sorry.
> -vlad

So I've been looking into this for some time now and did a basic implementation of vlan handling
using rhashtables, here are some thoughts and a slightly different proposition.
First a few scenarios (the memory footprint is only the extra memory needed for the
Current memory footprint for 48 ports & 2000 vlans ~ 50k

1. Bridge with vlan hash with port bitmaps (similar to Vlad's first set)
- On input we have hash lookup + bitmap lookup
- If (r)hashtable is used we need additional list to handle stable list walks which are
needed all over the place from error handling to compressed vlan dumps which actually
need this list to be kept sorted since the already exposed user interfaces need to
be handled without visible changes, but they also allow for per-port vlan compressed
dumping which isn't easy to handle. Mostly the stability issue with rhashtable
is with resizing since these entries change only under rtnl, also we need the sorted
order because of the compressed dump. One alternative way to solve this is to build the
sorted list each time a dump is requested, but again this falls under the workarounds
needed to satisfy current behaviour requirements.
If this is chosen my preference is to have the vlans also in a list which is kept sorted
for the walks, then the compressed request can be satisfied easier.
- memory footprint for 2000 vlans with 48 ports ~ 1.5 MB

2. Bridge with vlan hash, ports with vlan hashes (need a special per-port struct because
of the tagged/untagged case, we basically need per-port per-vlan flags)
- On input we have 1 hash lookup only from the port vlan hash where get a pointer
to the bridge's vlan entry so we get the global vlan context as well as the local
- Same rhashtable handling requirements apply + more complexity & memory due to having
to keep in sync multiple (per-port, per-bridge global) rhashtables
- memory footprint for 2000 vlans with 48 ports ~ 2.6 MB

Up until now I've done partially point 1 to see how much churn it would take and the
basic change is huge. Also the memory footprint increases a lot.
So I'd propose a third option which you may call middle ground between the current
implementation (which is very fast and compact) and points 1 & 2:

What do you think about adding an auxiliary per-vlan global context using rhashtable
which is not used in the ingress/egress decision making ? We can contain it
via either a Kconfig option (so it can be compiled out) or via a dynamic run-time option
so people who would like more features can enabled it on demand and are willing to
trade some performance and memory.
This way we won't have to change most of the current API and won't have to add workarounds
to keep the user-facing behaviour the same, also the syncing is reduced to
a refcount and the memory footprint is kept minimal.
The initial new features I'd like to introduce are per-vlan counters and also per-vlan
flags which at first will be used to enable/disable multicast on a vlan basis.
In terms of performance if this is enabled it is close to point 1 but without the changes
all over the API and more importantly with much less memory footprint.
The memory footprint of this option with 2000 vlans & 48 ports ~ +70k (without the per-cpu
counters, any additional feature will naturally add to this). This is because we don't
have a per-port increase for each vlan added and only keep the global context.

If it's acceptable to take the performance/memory hit and the huge churn, then I can continue
with 1 or 2, but I'm not a big fan of that idea.

Feedback before I go any further on this would be much appreciated.

Thank you,

More information about the Bridge mailing list