[Bridge] [RFC net-next v3 06/10] net: bridge: mrp: switchdev: Extend switchdev API to offload MRP

Tue Jan 28 09:58:23 UTC 2020

On 27.01.2020 15:39, Jürgen Lambrecht wrote:
>On 1/27/20 1:27 PM, Allan W. Nielsen wrote:
>> Hi Jürgen,
>>
>> On 27.01.2020 12:29, Jürgen Lambrecht wrote:
>>> On 1/26/20 4:59 PM, Andrew Lunn wrote:
>>>> Given the design of the protocol, if the hardware decides the OS etc
>>>> is dead, it should stop sending MRP_TEST frames and unblock the ports.
>>>> It then becomes a 'dumb switch', and for a short time there will be a
>>>> broadcast storm. Hopefully one of the other nodes will then take over
>>>> the role and block a port.
>How to fall back can probably be a configuration option in the hardware.
>>
>>> In my experience a closed loop should never happen. It can make
>>> software crash and cause other problems. Another node should first
>>> take over before the ring ports are unblocked. (If this is possible - I
>>> am only partly following this discussion.)
>>>
>>> What is your opinion?
>> Having loops in the network is never a good thing - but to be honest, I
>> think it is more important that we ensure the design can survive and
>> recover from loops.
>Indeed
>>
>> With the current design, it will be really hard to avoid loops when the
>> network boots. MRP will actually start with the ports blocked, but they
>> will be unblocked in the period from when the bridge is created and
>> until MRP is enabled. If we want to change this (which I'm not too keen
>> on), then we need to be able to block the ports while the bridge is
>> down.
>Our ring network is part of a bigger network. Loops are really not allowed.
That is understood, and should be avoided. But I assume that switches
which crash are not allowed either ;-)

We will consider whether we can somehow block the ports before/after a
user-space protocol kicks in. I cannot promise anything, but we will
see what can be done.

>> And even if we do this, we cannot guarantee that loops are avoided. Let's
>> assume we have a small ring with just 2 nodes: an MRM and an MRC. Let's
>> assume the MRM boots first. It will unblock both ports as the ring is
>> open. Now the MRC boots and closes the ring, creating a loop.
>> It will take some time (milliseconds) before the MRM notices this and
>> blocks one of the ports.
>In my view there is a bring-up and tear-down module needed. I don't
>know if it should be part of MRP or not? Probably not, so something on
>top of the mrp daemon.
If we need this kind of policy, then I agree it should be on top of or
outside the user-space MRP daemon.
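
The two-node boot scenario quoted above (MRM first, then MRC) can be
sketched as a toy model. This is only an illustration, not kernel or
daemon code: the struct, the function names and the millisecond time
step are all made up for this example, and the detection delay is just
a parameter, since the thread only says "milliseconds".

```c
#include <stdbool.h>

/* Hypothetical model of the ring manager (MRM); not taken from the patch set. */
enum port_state { PORT_BLOCKED, PORT_FORWARDING };

struct mrm {
    enum port_state primary;
    enum port_state secondary;
};

/* MRM boots while the ring is open, so it forwards on both ring ports. */
static void mrm_boot_ring_open(struct mrm *m)
{
    m->primary = PORT_FORWARDING;
    m->secondary = PORT_FORWARDING;
}

/* MRM has noticed that the ring is closed: block one of the ring ports. */
static void mrm_ring_closed_detected(struct mrm *m)
{
    m->secondary = PORT_BLOCKED;
}

/* A loop exists while the ring is closed and both MRM ports forward. */
static bool ring_has_loop(const struct mrm *m, bool ring_closed)
{
    return ring_closed &&
           m->primary == PORT_FORWARDING &&
           m->secondary == PORT_FORWARDING;
}

/*
 * Step through time in 1 ms ticks. The MRC closes the ring at mrc_boot_ms;
 * the MRM reacts detect_delay_ms later (assumed value, supplied by the caller).
 * Returns the number of milliseconds during which a loop existed.
 */
static int simulate_loop_ms(int mrc_boot_ms, int detect_delay_ms, int total_ms)
{
    struct mrm m;
    int loop_ms = 0;

    mrm_boot_ring_open(&m);
    for (int t = 0; t < total_ms; t++) {
        bool ring_closed = t >= mrc_boot_ms;

        if (ring_closed && t >= mrc_boot_ms + detect_delay_ms)
            mrm_ring_closed_detected(&m);
        if (ring_has_loop(&m, ring_closed))
            loop_ms++;
    }
    return loop_ms;
}
```

In this model simulate_loop_ms(100, 20, 200) reports a loop lasting as
long as the detection delay, which is the window that blocking ports
at bridge creation alone would not remove.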

>> But while we are at this topic, we need to add some functionality to
>> the user-space application such that it can set the priority of the MRP
>> frames. We will get that fixed.
>Indeed! In my old design I had to give them high priority, else the loop was
>wrongly closed at high network load.
Yes, I'm not surprised to hear that.

>I guess you mean the priority in the VLAN header?
>I seem to remember someone saying the bridge code is VLAN-agnostic.
Yes, if it has a VLAN header (which is optional). But even without the
VLAN header these frames need to be classified to a high-priority
queue.
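
For the tagged case, the priority lives in the 3-bit PCP field of the
802.1Q TCI (PCP, then 1 bit DEI, then 12 bits VLAN ID). A minimal
sketch of packing and reading that field follows. The helper names are
made up for this example, and which PCP value the MRP frames should
actually get is not decided in this thread.

```c
#include <stdint.h>

/*
 * Build an 802.1Q Tag Control Information value:
 *   bits 15..13  PCP (priority code point)
 *   bit  12      DEI (drop eligible indicator)
 *   bits 11..0   VID (VLAN identifier)
 * Hypothetical helper for illustration; inputs are masked to field width.
 */
static uint16_t vlan_tci(uint8_t pcp, uint8_t dei, uint16_t vid)
{
    return (uint16_t)(((pcp & 0x7u) << 13) |
                      ((dei & 0x1u) << 12) |
                      (vid & 0x0FFFu));
}

/* Extract the PCP from a TCI value in host byte order. */
static uint8_t vlan_tci_pcp(uint16_t tci)
{
    return (uint8_t)(tci >> 13);
}
```

Untagged frames carry no PCP at all, so for those the classification
to a high-priority queue has to be done by other means.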

>>> (FYI: I made that mistake once doing a proof-of-concept ring design:
>>> during testing, when a "broken" Ethernet cable was "fixed" I had for a
>>> short time a loop, and then it happened often that that port of the
>>> (Marvell 88E6063) switch was blocked.  (To unblock, only solution was
>>> to bring that port down and up again, and then all "lost" packets came
>>> out in a burst.) That problem was caused by flow control (with pause
>>> frames), and disabling flow control fixed it, but flow-control is
>>> default on as far as I know.)
>> I see. It could be fun to see if what we have proposed so far will work
>> with such a switch.
>
>Depending on the projects I could work on it later this year (or only next year or not..)
Sounds good - no hurry.

/Allan